Overview

Dataset statistics

Number of variables: 40
Number of observations: 100000
Missing cells: 553075
Missing cells (%): 13.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 25.2 MiB
Average record size in memory: 264.0 B
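
The missing-cell percentage follows from the counts in the dataset statistics: 553075 missing cells out of 40 × 100000 cells in total. A minimal sketch of that arithmetic; the function name `missing_cell_share` is illustrative and not part of pandas-profiling.

```python
def missing_cell_share(missing_cells: int, n_variables: int, n_observations: int) -> float:
    """Return the fraction of all cells that are missing."""
    total_cells = n_variables * n_observations
    return missing_cells / total_cells


# Figures taken from the dataset statistics of this report.
share = missing_cell_share(553075, 40, 100000)
print(f"{share:.1%}")  # 13.8%
```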

Variable types

Categorical: 27
Numeric: 13

Alerts

C_1 has a high cardinality: 541 distinct values High cardinality
C_2 has a high cardinality: 497 distinct values High cardinality
C_3 has a high cardinality: 43869 distinct values High cardinality
C_4 has a high cardinality: 25183 distinct values High cardinality
C_5 has a high cardinality: 145 distinct values High cardinality
C_7 has a high cardinality: 7623 distinct values High cardinality
C_8 has a high cardinality: 257 distinct values High cardinality
C_10 has a high cardinality: 10997 distinct values High cardinality
C_11 has a high cardinality: 3799 distinct values High cardinality
C_12 has a high cardinality: 41311 distinct values High cardinality
C_13 has a high cardinality: 2796 distinct values High cardinality
C_15 has a high cardinality: 5238 distinct values High cardinality
C_16 has a high cardinality: 34616 distinct values High cardinality
C_18 has a high cardinality: 2548 distinct values High cardinality
C_19 has a high cardinality: 1302 distinct values High cardinality
C_21 has a high cardinality: 38617 distinct values High cardinality
C_24 has a high cardinality: 12334 distinct values High cardinality
C_26 has a high cardinality: 9526 distinct values High cardinality
I_1 is highly correlated with I_5 and 2 other fields High correlation
I_4 is highly correlated with I_8 and 1 other fields High correlation
I_5 is highly correlated with I_1 and 2 other fields High correlation
I_6 is highly correlated with I_1 and 2 other fields High correlation
I_7 is highly correlated with I_11 High correlation
I_8 is highly correlated with I_4 and 1 other fields High correlation
I_10 is highly correlated with I_1 and 2 other fields High correlation
I_11 is highly correlated with I_7 High correlation
I_13 is highly correlated with I_4 and 1 other fields High correlation
I_7 is highly correlated with I_11 High correlation
I_8 is highly correlated with I_13 High correlation
I_11 is highly correlated with I_7 High correlation
I_13 is highly correlated with I_8 High correlation
I_1 is highly correlated with I_5 and 1 other fields High correlation
I_4 is highly correlated with I_13 High correlation
I_5 is highly correlated with I_1 and 1 other fields High correlation
I_6 is highly correlated with I_10 High correlation
I_7 is highly correlated with I_11 High correlation
I_10 is highly correlated with I_1 and 2 other fields High correlation
I_11 is highly correlated with I_7 High correlation
I_13 is highly correlated with I_4 High correlation
I_1 is highly correlated with I_7 High correlation
I_4 is highly correlated with I_8 and 1 other fields High correlation
I_5 is highly correlated with C_9 High correlation
I_7 is highly correlated with I_1 High correlation
I_8 is highly correlated with I_4 and 1 other fields High correlation
I_13 is highly correlated with I_4 and 1 other fields High correlation
C_6 is highly correlated with C_22 High correlation
C_9 is highly correlated with I_5 and 1 other fields High correlation
C_14 is highly correlated with C_23 High correlation
C_17 is highly correlated with C_9 High correlation
C_22 is highly correlated with C_6 High correlation
C_23 is highly correlated with C_14 High correlation
I_1 has 44413 (44.4%) missing values Missing
I_3 has 19102 (19.1%) missing values Missing
I_4 has 19534 (19.5%) missing values Missing
I_5 has 4760 (4.8%) missing values Missing
I_6 has 25107 (25.1%) missing values Missing
I_7 has 4719 (4.7%) missing values Missing
I_9 has 4719 (4.7%) missing values Missing
I_10 has 44413 (44.4%) missing values Missing
I_11 has 4719 (4.7%) missing values Missing
I_12 has 77180 (77.2%) missing values Missing
I_13 has 19534 (19.5%) missing values Missing
C_3 has 3935 (3.9%) missing values Missing
C_4 has 3935 (3.9%) missing values Missing
C_6 has 13708 (13.7%) missing values Missing
C_12 has 3935 (3.9%) missing values Missing
C_16 has 3935 (3.9%) missing values Missing
C_19 has 41471 (41.5%) missing values Missing
C_20 has 41471 (41.5%) missing values Missing
C_21 has 3935 (3.9%) missing values Missing
C_22 has 81566 (81.6%) missing values Missing
C_24 has 3935 (3.9%) missing values Missing
C_25 has 41471 (41.5%) missing values Missing
C_26 has 41471 (41.5%) missing values Missing
I_3 is highly skewed (γ1 = 74.67720032) Skewed
I_7 is highly skewed (γ1 = 42.4131813) Skewed
I_8 is highly skewed (γ1 = 68.85501099) Skewed
I_12 is highly skewed (γ1 = 42.70809937) Skewed
I_13 is highly skewed (γ1 = 67.00427246) Skewed
I_1 has 24287 (24.3%) zeros Zeros
I_2 has 18010 (18.0%) zeros Zeros
I_4 has 5336 (5.3%) zeros Zeros
I_5 has 1828 (1.8%) zeros Zeros
I_6 has 4890 (4.9%) zeros Zeros
I_7 has 24234 (24.2%) zeros Zeros
I_8 has 11159 (11.2%) zeros Zeros
I_9 has 3013 (3.0%) zeros Zeros
I_10 has 26171 (26.2%) zeros Zeros
I_11 has 25355 (25.4%) zeros Zeros
I_12 has 16957 (17.0%) zeros Zeros
I_13 has 5200 (5.2%) zeros Zeros
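
The Skewed alerts above report a skewness coefficient γ1 for each flagged variable. As a rough illustration of what that quantity measures, the sketch below computes the population moment coefficient of skewness, m3 / m2^1.5. This is an assumption about the general definition only: the report may use a bias-corrected sample estimator, so values computed this way need not match it digit for digit. The function name and toy data are hypothetical.

```python
def moment_skewness(values):
    """Population moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]      # toy data, no skew
long_right_tail = [0, 0, 0, 10]  # toy data, one large value

print(moment_skewness(symmetric))        # 0.0
print(moment_skewness(long_right_tail))  # about 1.15
```

A long right tail, as in the count-like I_* columns, gives a positive coefficient; the larger the outliers relative to the bulk of the data, the larger the value.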

Reproduction

Analysis started: 2022-08-03 12:39:48.799111
Analysis finished: 2022-08-03 12:40:06.395209
Duration: 17.6 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

label
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

Value  Count
0      77337
1      22663

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 100000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
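
As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts characters by category; the helper name `category_counts` and the sample strings are illustrative, and this is not necessarily how the report computed its own figures.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# "Nd" is Decimal Number, "Ll" is Lowercase Letter.
print(category_counts(["0", "1", "0"]))  # Counter({'Nd': 3})
print(category_counts(["68fd1e64"]))     # Counter({'Nd': 5, 'Ll': 3})
```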

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      77337  77.3%
1      22663  22.7%
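
A value/count/frequency table like the one above can be rebuilt from raw values with the standard library. This is a minimal sketch, assuming the frequency is the count divided by the total number of rows; `frequency_table` is a hypothetical helper, not a pandas-profiling function.

```python
from collections import Counter


def frequency_table(values, top=10):
    """Return (value, count, percent) rows for the most common values."""
    counts = Counter(values)
    total = len(values)
    return [(v, c, round(100 * c / total, 1)) for v, c in counts.most_common(top)]


# Rebuild the label column from the counts shown in this report.
labels = ["0"] * 77337 + ["1"] * 22663
for row in frequency_table(labels):
    print(row)
# ('0', 77337, 77.3)
# ('1', 22663, 22.7)
```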

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0      77337  77.3%
1      22663  22.7%

Most occurring characters

Value  Count  Frequency (%)
0      77337  77.3%
1      22663  22.7%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  100000  100.0%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      77337  77.3%
1      22663  22.7%

Most occurring scripts

Value   Count   Frequency (%)
Common  100000  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      77337  77.3%
1      22663  22.7%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  100000  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0      77337  77.3%
1      22663  22.7%

I_1
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 152
Distinct (%): 0.3%
Missing: 44413
Missing (%): 44.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.768722903
Minimum: 0
Maximum: 556
Zeros: 24287
Zeros (%): 24.3%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 3
95-th percentile: 17
Maximum: 556
Range: 556
Interquartile range (IQR): 3
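
The quartiles and the interquartile range (Q3 minus Q1) can be sketched with the standard `statistics` module. The toy list below is hypothetical, and `method="inclusive"` (linear interpolation between order statistics) is an assumption about how the report computes quantiles.

```python
import statistics


def quartile_summary(values):
    """Return Q1, median, Q3 and the interquartile range (Q3 - Q1)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"Q1": q1, "median": median, "Q3": q3, "IQR": q3 - q1}


# Small right-skewed toy sample, not taken from the dataset.
print(quartile_summary([0, 0, 1, 3, 17]))
# {'Q1': 0.0, 'median': 1.0, 'Q3': 3.0, 'IQR': 3.0}
```

Because the IQR ignores the tails, it stays small (3 for I_1) even when the range (556) is dominated by a few extreme values.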

Descriptive statistics

Standard deviation: 10.45120811
Coefficient of variation (CV): 2.773143153
Kurtosis: 373.7926331
Mean: 3.768722903
Median Absolute Deviation (MAD): 1
Skewness: 13.0450964
Sum: 209492
Variance: 109.2277527
Monotonicity: Not monotonic
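
Several of the I_1 statistics are derived from one another, which makes them easy to cross-check. The sketch below copies the figures from this report and verifies three relationships: the mean is the sum divided by the number of non-missing rows, the variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. Variable names are illustrative.

```python
import math

# Figures copied from the I_1 statistics in this report.
n_observations = 100000
missing = 44413
total = 209492
mean = 3.768722903
std = 10.45120811
variance = 109.2277527
cv = 2.773143153

non_missing = n_observations - missing  # 55587 rows carry a value

# The mean is taken over non-missing values only.
assert math.isclose(total / non_missing, mean, rel_tol=1e-6)
# Variance is the square of the standard deviation.
assert math.isclose(std ** 2, variance, rel_tol=1e-6)
# Coefficient of variation is standard deviation divided by mean.
assert math.isclose(std / mean, cv, rel_tol=1e-6)
print("I_1 summary statistics are mutually consistent")
```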
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   24287  24.3%
1                   9337   9.3%
2                   5103   5.1%
3                   3289   3.3%
4                   2375   2.4%
5                   1687   1.7%
6                   1364   1.4%
7                   993    1.0%
8                   919    0.9%
9                   674    0.7%
Other values (142)  5559   5.6%
(Missing)           44413  44.4%

Minimum values

Value  Count  Frequency (%)
0      24287  24.3%
1      9337   9.3%
2      5103   5.1%
3      3289   3.3%
4      2375   2.4%
5      1687   1.7%
6      1364   1.4%
7      993    1.0%
8      919    0.9%
9      674    0.7%

Maximum values

Value  Count  Frequency (%)
556    1      < 0.1%
466    1      < 0.1%
363    1      < 0.1%
352    1      < 0.1%
326    1      < 0.1%
314    1      < 0.1%
287    1      < 0.1%
275    1      < 0.1%
255    1      < 0.1%
229    1      < 0.1%

I_2
Real number (ℝ)

ZEROS

Distinct: 2693
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.86373
Minimum: -2
Maximum: 18522
Zeros: 18010
Zeros (%): 18.0%
Negative: 9920
Negative (%): 9.9%
Memory size: 390.8 KiB

Quantile statistics

Minimum: -2
5-th percentile: -1
Q1: 0
median: 3
Q3: 40
95-th percentile: 587
Maximum: 18522
Range: 18524
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 401.5226135
Coefficient of variation (CV): 3.557587664
Kurtosis: 100.7127914
Mean: 112.86373
Median Absolute Deviation (MAD): 4
Skewness: 7.263789177
Sum: 11286373
Variance: 161220.4062
Monotonicity: Not monotonic
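
The Median Absolute Deviation is far smaller than the standard deviation here (4 versus about 401.5) because it is insensitive to extreme values. A minimal sketch of the unscaled definition, the median of absolute deviations from the median; the helper name and toy data are hypothetical, and whether the report applies any scaling factor is not stated.

```python
import statistics


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median (unscaled)."""
    center = statistics.median(values)
    return statistics.median(abs(x - center) for x in values)


# One extreme value barely moves the MAD, while it inflates the standard deviation.
toy = [1, 2, 3, 4, 100]
print(median_absolute_deviation(toy))  # 1
print(statistics.pstdev(toy) > 30)     # True
```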
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
0                    18010  18.0%
1                    15343  15.3%
-1                   9847   9.8%
2                    5633   5.6%
3                    2828   2.8%
4                    2029   2.0%
5                    1659   1.7%
6                    1416   1.4%
7                    1202   1.2%
8                    1070   1.1%
Other values (2683)  40963  41.0%

Minimum values

Value  Count  Frequency (%)
-2     73     0.1%
-1     9847   9.8%
0      18010  18.0%
1      15343  15.3%
2      5633   5.6%
3      2828   2.8%
4      2029   2.0%
5      1659   1.7%
6      1416   1.4%
7      1202   1.2%

Maximum values

Value  Count  Frequency (%)
18522  1      < 0.1%
11178  1      < 0.1%
9205   1      < 0.1%
8180   1      < 0.1%
8071   1      < 0.1%
8016   1      < 0.1%
7864   1      < 0.1%
7796   1      < 0.1%
7551   1      < 0.1%
6942   1      < 0.1%

I_3
Real number (ℝ≥0)

MISSING
SKEWED

Distinct: 943
Distinct (%): 1.2%
Missing: 19102
Missing (%): 19.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.74491335
Minimum: 0
Maximum: 65535
Zeros: 581
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
median: 8
Q3: 23
95-th percentile: 109
Maximum: 65535
Range: 65535
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 538.8188477
Coefficient of variation (CV): 13.22419913
Kurtosis: 7426.687012
Mean: 40.74491335
Median Absolute Deviation (MAD): 6
Skewness: 74.67720032
Sum: 3296182
Variance: 290325.75
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1                   11371  11.4%
2                   7869   7.9%
3                   5836   5.8%
4                   4772   4.8%
5                   3879   3.9%
6                   3215   3.2%
7                   2779   2.8%
8                   2518   2.5%
9                   2133   2.1%
10                  1913   1.9%
Other values (933)  34613  34.6%
(Missing)           19102  19.1%

Minimum values

Value  Count  Frequency (%)
0      581    0.6%
1      11371  11.4%
2      7869   7.9%
3      5836   5.8%
4      4772   4.8%
5      3879   3.9%
6      3215   3.2%
7      2779   2.8%
8      2518   2.5%
9      2133   2.1%

Maximum values

Value  Count  Frequency (%)
65535  2      < 0.1%
55503  1      < 0.1%
25871  1      < 0.1%
23092  1      < 0.1%
21639  1      < 0.1%
21636  1      < 0.1%
21568  1      < 0.1%
21537  1      < 0.1%
21517  1      < 0.1%
21491  1      < 0.1%

I_4
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 135
Distinct (%): 0.2%
Missing: 19534
Missing (%): 19.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.280317153
Minimum: 0
Maximum: 417
Zeros: 5336
Zeros (%): 5.3%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 5
Q3: 11
95-th percentile: 29
Maximum: 417
Range: 417
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 10.83633614
Coefficient of variation (CV): 1.308686121
Kurtosis: 115.5767212
Mean: 8.280317153
Median Absolute Deviation (MAD): 4
Skewness: 5.891324997
Sum: 666284
Variance: 117.4261856
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1                   12264  12.3%
2                   9513   9.5%
3                   7110   7.1%
4                   5833   5.8%
0                   5336   5.3%
5                   4663   4.7%
6                   4041   4.0%
7                   3321   3.3%
8                   2888   2.9%
9                   2457   2.5%
Other values (125)  23040  23.0%
(Missing)           19534  19.5%

Minimum values

Value  Count  Frequency (%)
0      5336   5.3%
1      12264  12.3%
2      9513   9.5%
3      7110   7.1%
4      5833   5.8%
5      4663   4.7%
6      4041   4.0%
7      3321   3.3%
8      2888   2.9%
9      2457   2.5%

Maximum values

Value  Count  Frequency (%)
417    1      < 0.1%
411    1      < 0.1%
380    1      < 0.1%
280    1      < 0.1%
275    1      < 0.1%
268    1      < 0.1%
255    1      < 0.1%
253    1      < 0.1%
248    1      < 0.1%
231    1      < 0.1%

I_5
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 23041
Distinct (%): 24.2%
Missing: 4760
Missing (%): 4.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 17592.5994
Minimum: 0
Maximum: 1741128
Zeros: 1828
Zeros (%): 1.8%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 228
median: 2213
Q3: 10209
95-th percentile: 55473.25
Maximum: 1741128
Range: 1741128
Interquartile range (IQR): 9981

Descriptive statistics

Standard deviation: 65797.89844
Coefficient of variation (CV): 3.740089622
Kurtosis: 115.0488815
Mean: 17592.5994
Median Absolute Deviation (MAD): 2199
Skewness: 9.146624565
Sum: 1675519167
Variance: 4329362944
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
1                     2097   2.1%
0                     1828   1.8%
2                     1473   1.5%
4                     1092   1.1%
5                     908    0.9%
7                     772    0.8%
8                     634    0.6%
10                    602    0.6%
11                    467    0.5%
12                    430    0.4%
Other values (23031)  84937  84.9%
(Missing)             4760   4.8%

Minimum values

Value  Count  Frequency (%)
0      1828   1.8%
1      2097   2.1%
2      1473   1.5%
4      1092   1.1%
5      908    0.9%
7      772    0.8%
8      634    0.6%
10     602    0.6%
11     467    0.5%
12     430    0.4%

Maximum values

Value    Count  Frequency (%)
1741128  1      < 0.1%
1618112  1      < 0.1%
1572361  1      < 0.1%
1551929  1      < 0.1%
1530718  1      < 0.1%
1530652  1      < 0.1%
1466627  1      < 0.1%
1424531  1      < 0.1%
1386732  1      < 0.1%
1361070  1      < 0.1%

I_6
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 2055
Distinct (%): 2.7%
Missing: 25107
Missing (%): 25.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 139.6850841
Minimum: 0
Maximum: 16290
Zeros: 4890
Zeros (%): 4.9%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 9
median: 37
Q3: 122
95-th percentile: 594
Maximum: 16290
Range: 16290
Interquartile range (IQR): 113

Descriptive statistics

Standard deviation: 371.7760925
Coefficient of variation (CV): 2.661530363
Kurtosis: 251.314621
Mean: 139.6850841
Median Absolute Deviation (MAD): 34
Skewness: 11.40925312
Sum: 10461435
Variance: 138217.4688
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
0                    4890   4.9%
1                    2416   2.4%
2                    2087   2.1%
3                    1835   1.8%
4                    1610   1.6%
5                    1477   1.5%
6                    1385   1.4%
7                    1226   1.2%
8                    1209   1.2%
10                   1063   1.1%
Other values (2045)  55695  55.7%
(Missing)            25107  25.1%

Minimum values

Value  Count  Frequency (%)
0      4890   4.9%
1      2416   2.4%
2      2087   2.1%
3      1835   1.8%
4      1610   1.6%
5      1477   1.5%
6      1385   1.4%
7      1226   1.2%
8      1209   1.2%
9      1039   1.0%

Maximum values

Value  Count  Frequency (%)
16290  1      < 0.1%
15658  1      < 0.1%
13560  1      < 0.1%
12167  1      < 0.1%
10479  1      < 0.1%
10261  1      < 0.1%
9924   1      < 0.1%
9486   1      < 0.1%
9177   1      < 0.1%
9115   1      < 0.1%

I_7
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
SKEWED
ZEROS

Distinct: 628
Distinct (%): 0.7%
Missing: 4719
Missing (%): 4.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.22209045
Minimum: 0
Maximum: 8807
Zeros: 24234
Zeros (%): 24.2%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 3
Q3: 11
95-th percentile: 61
Maximum: 8807
Range: 8807
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 65.46048737
Coefficient of variation (CV): 4.300361214
Kurtosis: 4233.842285
Mean: 15.22209045
Median Absolute Deviation (MAD): 3
Skewness: 42.4131813
Sum: 1450376
Variance: 4285.075684
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   24234  24.2%
1                   13333  13.3%
2                   8373   8.4%
3                   5656   5.7%
4                   4702   4.7%
5                   3717   3.7%
6                   3115   3.1%
7                   2490   2.5%
8                   2240   2.2%
9                   1875   1.9%
Other values (618)  25546  25.5%
(Missing)           4719   4.7%

Minimum values

Value  Count  Frequency (%)
0      24234  24.2%
1      13333  13.3%
2      8373   8.4%
3      5656   5.7%
4      4702   4.7%
5      3717   3.7%
6      3115   3.1%
7      2490   2.5%
8      2240   2.2%
9      1875   1.9%

Maximum values

Value  Count  Frequency (%)
8807   1      < 0.1%
5573   1      < 0.1%
3279   1      < 0.1%
2963   1      < 0.1%
2508   1      < 0.1%
2240   1      < 0.1%
2061   1      < 0.1%
1976   1      < 0.1%
1950   1      < 0.1%
1921   1      < 0.1%

I_8
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 155
Distinct (%): 0.2%
Missing: 107
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.57482506
Minimum: 0
Maximum: 4677
Zeros: 11159
Zeros (%): 11.2%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 8
Q3: 20
95-th percentile: 42
Maximum: 4677
Range: 4677
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 46.54159546
Coefficient of variation (CV): 3.428522669
Kurtosis: 5577.802734
Mean: 13.57482506
Median Absolute Deviation (MAD): 7
Skewness: 68.85501099
Sum: 1356030
Variance: 2166.120117
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   11159  11.2%
1                   8075   8.1%
2                   6798   6.8%
3                   5671   5.7%
4                   5196   5.2%
5                   4394   4.4%
6                   4244   4.2%
7                   3839   3.8%
8                   2909   2.9%
9                   2826   2.8%
Other values (145)  44782  44.8%

Minimum values

Value  Count  Frequency (%)
0      11159  11.2%
1      8075   8.1%
2      6798   6.8%
3      5671   5.7%
4      5196   5.2%
5      4394   4.4%
6      4244   4.2%
7      3839   3.8%
8      2909   2.9%
9      2826   2.8%

Maximum values

Value  Count  Frequency (%)
4677   1      < 0.1%
4603   1      < 0.1%
4352   1      < 0.1%
3967   1      < 0.1%
3964   1      < 0.1%
3747   1      < 0.1%
3343   1      < 0.1%
3220   1      < 0.1%
3193   1      < 0.1%
2994   1      < 0.1%

I_9
Real number (ℝ≥0)

MISSING
ZEROS

Distinct: 1942
Distinct (%): 2.0%
Missing: 4719
Missing (%): 4.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 125.2949066
Minimum: 0
Maximum: 12661
Zeros: 3013
Zeros (%): 3.0%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 10
median: 40
Q3: 120
95-th percentile: 524
Maximum: 12661
Range: 12661
Interquartile range (IQR): 110

Descriptive statistics

Standard deviation: 286.4156799
Coefficient of variation (CV): 2.285932346
Kurtosis: 163.1750031
Mean: 125.2949066
Median Absolute Deviation (MAD): 36
Skewness: 9.009661674
Sum: 11938224
Variance: 82033.94531
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
1                    3877   3.9%
2                    3135   3.1%
0                    3013   3.0%
3                    2527   2.5%
4                    2073   2.1%
5                    1934   1.9%
6                    1853   1.9%
7                    1543   1.5%
8                    1478   1.5%
9                    1368   1.4%
Other values (1932)  72480  72.5%
(Missing)            4719   4.7%

Minimum values

Value  Count  Frequency (%)
0      3013   3.0%
1      3877   3.9%
2      3135   3.1%
3      2527   2.5%
4      2073   2.1%
5      1934   1.9%
6      1853   1.9%
7      1543   1.5%
8      1478   1.5%
9      1368   1.4%

Maximum values

Value  Count  Frequency (%)
12661  1      < 0.1%
9794   1      < 0.1%
7874   1      < 0.1%
7526   1      < 0.1%
7514   1      < 0.1%
7503   1      < 0.1%
7500   1      < 0.1%
7476   1      < 0.1%
7463   1      < 0.1%
7338   1      < 0.1%

I_10
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 44413
Missing (%): 44.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6201090183
Minimum: 0
Maximum: 6
Zeros: 26171
Zeros (%): 26.2%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 1
95-th percentile: 2
Maximum: 6
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.6770553589
Coefficient of variation (CV): 1.091832789
Kurtosis: 2.494926214
Mean: 0.6201090183
Median Absolute Deviation (MAD): 1
Skewness: 1.124920964
Sum: 34470
Variance: 0.458403945
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common values

Value      Count  Frequency (%)
0          26171  26.2%
1          25277  25.3%
2          3409   3.4%
3          580    0.6%
4          119    0.1%
5          27     < 0.1%
6          4      < 0.1%
(Missing)  44413  44.4%

Minimum values

Value  Count  Frequency (%)
0      26171  26.2%
1      25277  25.3%
2      3409   3.4%
3      580    0.6%
4      119    0.1%
5      27     < 0.1%
6      4      < 0.1%

Maximum values

Value  Count  Frequency (%)
6      4      < 0.1%
5      27     < 0.1%
4      119    0.1%
3      580    0.6%
2      3409   3.4%
1      25277  25.3%
0      26171  26.2%

I_11
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 86
Distinct (%): 0.1%
Missing: 4719
Missing (%): 4.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.400268679
Minimum: 0
Maximum: 104
Zeros: 25355
Zeros (%): 25.4%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 9
Maximum: 104
Range: 104
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 4.629926205
Coefficient of variation (CV): 1.928919977
Kurtosis: 66.44548035
Mean: 2.400268679
Median Absolute Deviation (MAD): 1
Skewness: 6.399148464
Sum: 228700
Variance: 21.43621635
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
1                  34032  34.0%
0                  25355  25.4%
2                  12758  12.8%
3                  6774   6.8%
4                  4045   4.0%
5                  2557   2.6%
6                  1909   1.9%
7                  1459   1.5%
8                  1053   1.1%
9                  832    0.8%
Other values (76)  4507   4.5%
(Missing)          4719   4.7%

Minimum values

Value  Count  Frequency (%)
0      25355  25.4%
1      34032  34.0%
2      12758  12.8%
3      6774   6.8%
4      4045   4.0%
5      2557   2.6%
6      1909   1.9%
7      1459   1.5%
8      1053   1.1%
9      832    0.8%

Maximum values

Value  Count  Frequency (%)
104    1      < 0.1%
101    1      < 0.1%
97     3      < 0.1%
96     1      < 0.1%
91     1      < 0.1%
90     1      < 0.1%
85     1      < 0.1%
84     1      < 0.1%
79     3      < 0.1%
78     1      < 0.1%

I_12
Real number (ℝ≥0)

MISSING
SKEWED
ZEROS

Distinct: 71
Distinct (%): 0.3%
Missing: 77180
Missing (%): 77.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9377738826
Minimum: 0
Maximum: 493
Zeros: 16957
Zeros (%): 17.0%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 4
Maximum: 493
Range: 493
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 5.32766819
Coefficient of variation (CV): 5.68118636
Kurtosis: 3363.044434
Mean: 0.9377738826
Median Absolute Deviation (MAD): 0
Skewness: 42.70809937
Sum: 21400
Variance: 28.38405037
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
0                  16957  17.0%
1                  3274   3.3%
2                  882    0.9%
3                  436    0.4%
4                  297    0.3%
5                  137    0.1%
6                  126    0.1%
7                  106    0.1%
8                  93     0.1%
10                 63     0.1%
Other values (61)  449    0.4%
(Missing)          77180  77.2%

Minimum values

Value  Count  Frequency (%)
0      16957  17.0%
1      3274   3.3%
2      882    0.9%
3      436    0.4%
4      297    0.3%
5      137    0.1%
6      126    0.1%
7      106    0.1%
8      93     0.1%
9      51     0.1%

Maximum values

Value  Count  Frequency (%)
493    1      < 0.1%
178    1      < 0.1%
159    1      < 0.1%
144    1      < 0.1%
130    1      < 0.1%
123    1      < 0.1%
98     1      < 0.1%
97     1      < 0.1%
87     2      < 0.1%
84     1      < 0.1%

I_13
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
SKEWED
ZEROS

Distinct: 273
Distinct (%): 0.3%
Missing: 19534
Missing (%): 19.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.60763552
Minimum: 0
Maximum: 6558
Zeros: 5200
Zeros (%): 5.2%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 5
Q3: 13
95-th percentile: 39
Maximum: 6558
Range: 6558
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 52.04455566
Coefficient of variation (CV): 4.483648333
Kurtosis: 5899.552246
Mean: 11.60763552
Median Absolute Deviation (MAD): 4
Skewness: 67.00427246
Sum: 934020
Variance: 2708.635742
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1                   11443  11.4%
2                   8730   8.7%
3                   6618   6.6%
4                   5355   5.4%
0                   5200   5.2%
5                   4374   4.4%
6                   3767   3.8%
7                   3023   3.0%
8                   2741   2.7%
9                   2283   2.3%
Other values (263)  26932  26.9%
(Missing)           19534  19.5%

Minimum values

Value  Count  Frequency (%)
0      5200   5.2%
1      11443  11.4%
2      8730   8.7%
3      6618   6.6%
4      5355   5.4%
5      4374   4.4%
6      3767   3.8%
7      3023   3.0%
8      2741   2.7%
9      2283   2.3%

Maximum values

Value  Count  Frequency (%)
6558   1      < 0.1%
4331   1      < 0.1%
4262   1      < 0.1%
4016   1      < 0.1%
3469   1      < 0.1%
3292   1      < 0.1%
3169   1      < 0.1%
3142   1      < 0.1%
2962   1      < 0.1%
2408   1      < 0.1%

C_1
Categorical

HIGH CARDINALITY

Distinct: 541
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

Value               Count
05db9164            50197
68fd1e64            16676
5a9ed9b0            8329
8cf07265            4887
be589b51            3336
Other values (536)  16575

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800000
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 273
Unique (%): 0.3%

Sample

1st row: 68fd1e64
2nd row: 68fd1e64
3rd row: 287e684f
4th row: 68fd1e64
5th row: 8cf07265

Common Values

Value               Count  Frequency (%)
05db9164            50197  50.2%
68fd1e64            16676  16.7%
5a9ed9b0            8329   8.3%
8cf07265            4887   4.9%
be589b51            3336   3.3%
5bfa8ab5            2364   2.4%
87552397            1806   1.8%
f473b8dc            1339   1.3%
39af2607            1151   1.2%
ae82ea21            873    0.9%
Other values (531)  9042   9.0%

Length

Histogram of lengths of the category

Value               Count  Frequency (%)
05db9164            50197  50.2%
68fd1e64            16676  16.7%
5a9ed9b0            8329   8.3%
8cf07265            4887   4.9%
be589b51            3336   3.3%
5bfa8ab5            2364   2.4%
87552397            1806   1.8%
f473b8dc            1339   1.3%
39af2607            1151   1.2%
ae82ea21            873    0.9%
Other values (531)  9042   9.0%

Most occurring characters

Value             Count  Frequency (%)
6                 94820  11.9%
5                 84025  10.5%
d                 80255  10.0%
9                 77775  9.7%
b                 75337  9.4%
1                 75191  9.4%
4                 74949  9.4%
0                 67999  8.5%
8                 35475  4.4%
e                 34261  4.3%
Other values (6)  99913  12.5%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    546811  68.4%
Lowercase Letter  253189  31.6%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
6      94820  17.3%
5      84025  15.4%
9      77775  14.2%
1      75191  13.8%
4      74949  13.7%
0      67999  12.4%
8      35475  6.5%
7      14846  2.7%
2      13073  2.4%
3      8658   1.6%

Lowercase Letter

Value  Count  Frequency (%)
d      80255  31.7%
b      75337  29.8%
e      34261  13.5%
f      30731  12.1%
a      21488  8.5%
c      11117  4.4%

Most occurring scripts

Value   Count   Frequency (%)
Common  546811  68.4%
Latin   253189  31.6%

Most frequent character per script

Common

Value  Count  Frequency (%)
6      94820  17.3%
5      84025  15.4%
9      77775  14.2%
1      75191  13.8%
4      74949  13.7%
0      67999  12.4%
8      35475  6.5%
7      14846  2.7%
2      13073  2.4%
3      8658   1.6%

Latin

Value  Count  Frequency (%)
d      80255  31.7%
b      75337  29.8%
e      34261  13.5%
f      30731  12.1%
a      21488  8.5%
c      11117  4.4%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  800000  100.0%

Most frequent character per block

ASCII

Value             Count  Frequency (%)
6                 94820  11.9%
5                 84025  10.5%
d                 80255  10.0%
9                 77775  9.7%
b                 75337  9.4%
1                 75191  9.4%
4                 74949  9.4%
0                 67999  8.5%
8                 35475  4.4%
e                 34261  4.3%
Other values (6)  99913  12.5%

C_2
Categorical

HIGH CARDINALITY

Distinct: 497
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

Value               Count
38a947a1            12895
09e68b86            6826
80e26c9b            4043
38d50e09            3724
287130e0            3269
Other values (492)  69243

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800000
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 19
Unique (%): < 0.1%

Sample

1st row: 80e26c9b
2nd row: f0cf0024
3rd row: 0a519c5c
4th row: 2c16a946
5th row: ae46a29d

Common Values

Value               Count  Frequency (%)
38a947a1            12895  12.9%
09e68b86            6826   6.8%
80e26c9b            4043   4.0%
38d50e09            3724   3.7%
287130e0            3269   3.3%
4f25e98b            3268   3.3%
1cfdf714            2903   2.9%
207b2d81            2366   2.4%
08d6d899            1988   2.0%
d833535f            1892   1.9%
Other values (487)  56826  56.8%

Length

Histogram of lengths of the category

Value               Count  Frequency (%)
38a947a1            12895  12.9%
09e68b86            6826   6.8%
80e26c9b            4043   4.0%
38d50e09            3724   3.7%
287130e0            3269   3.3%
4f25e98b            3268   3.3%
1cfdf714            2903   2.9%
207b2d81            2366   2.4%
08d6d899            1988   2.0%
d833535f            1892   1.9%
Other values (487)  56826  56.8%

Most occurring characters

Value             Count   Frequency (%)
8                 84438   10.6%
9                 71673   9.0%
a                 58158   7.3%
0                 57543   7.2%
e                 56139   7.0%
7                 51858   6.5%
4                 51100   6.4%
3                 46718   5.8%
6                 46464   5.8%
1                 43176   5.4%
Other values (6)  232733  29.1%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    528698  66.1%
Lowercase Letter  271302  33.9%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
8      84438  16.0%
9      71673  13.6%
0      57543  10.9%
7      51858  9.8%
4      51100  9.7%
3      46718  8.8%
6      46464  8.8%
1      43176  8.2%
2      41300  7.8%
5      34428  6.5%

Lowercase Letter

Value  Count  Frequency (%)
a      58158  21.4%
e      56139  20.7%
d      43089  15.9%
f      38785  14.3%
b      38140  14.1%
c      36991  13.6%

Most occurring scripts

Value   Count   Frequency (%)
Common  528698  66.1%
Latin   271302  33.9%

Most frequent character per script

Common

Value  Count  Frequency (%)
8      84438  16.0%
9      71673  13.6%
0      57543  10.9%
7      51858  9.8%
4      51100  9.7%
3      46718  8.8%
6      46464  8.8%
1      43176  8.2%
2      41300  7.8%
5      34428  6.5%

Latin

Value  Count  Frequency (%)
a      58158  21.4%
e      56139  20.7%
d      43089  15.9%
f      38785  14.3%
b      38140  14.1%
c      36991  13.6%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  800000  100.0%

Most frequent character per block

ASCII

Value             Count   Frequency (%)
8                 84438   10.6%
9                 71673   9.0%
a                 58158   7.3%
0                 57543   7.2%
e                 56139   7.0%
7                 51858   6.5%
4                 51100   6.4%
3                 46718   5.8%
6                 46464   5.8%
1                 43176   5.4%
Other values (6)  232733  29.1%

C_3
Categorical

HIGH CARDINALITY
MISSING

Distinct: 43869
Distinct (%): 45.7%
Missing: 3935
Missing (%): 3.9%
Memory size: 781.4 KiB

Value                 Count
d032c263              3703
b00d1501              1900
02cf9876              1707
aa8c1539              1579
77f2f2e5              1571
Other values (43864)  85605

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 768520
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 37861
Unique (%): 39.4%
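
The report distinguishes Distinct (how many different values occur) from Unique (how many values occur exactly once). For C_3 both percentages appear to be taken over the 96065 non-missing rows (43869 / 96065 ≈ 45.7%, 37861 / 96065 ≈ 39.4%); that reading is inferred from the arithmetic rather than stated in the report. A minimal sketch of the two counts, with a hypothetical helper name and a toy sample:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of distinct values, number of values seen exactly once).

    None entries are treated as missing and ignored.
    """
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy sample: one repeated value, two singletons, one missing entry.
sample = ["d032c263", "d032c263", "b00d1501", "02cf9876", None]
print(distinct_and_unique(sample))  # (3, 2)
```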

Sample

1st row: fb936136
2nd row: 6f67f7e5
3rd row: 02cf9876
4th row: a9a87e68
5th row: c81688bb

Common Values

Value                 Count  Frequency (%)
d032c263              3703   3.7%
b00d1501              1900   1.9%
02cf9876              1707   1.7%
aa8c1539              1579   1.6%
77f2f2e5              1571   1.6%
74e1a23a              1058   1.1%
9143c832              972    1.0%
2cbec47f              900    0.9%
ad4b77ff              855    0.9%
4470baf4              725    0.7%
Other values (43859)  81095  81.1%
(Missing)             3935   3.9%

Length

Histogram of lengths of the category

Value                 Count  Frequency (%)
d032c263              3703   3.9%
b00d1501              1900   2.0%
02cf9876              1707   1.8%
aa8c1539              1579   1.6%
77f2f2e5              1571   1.6%
74e1a23a              1058   1.1%
9143c832              972    1.0%
2cbec47f              900    0.9%
ad4b77ff              855    0.9%
4470baf4              725    0.8%
Other values (43859)  81095  84.4%
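
The two C_3 tables list the same counts with different percentages. The figures are consistent with the first table dividing by all 100000 observations and the second dividing by the 96065 non-missing ones; this interpretation is inferred from the numbers, not stated in the report. A short sketch of the arithmetic, with an illustrative helper name:

```python
def percent(count: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as displayed in the report."""
    return round(100 * count / denominator, 1)


n_observations = 100000
missing = 3935
non_missing = n_observations - missing  # 96065

count = 3703  # occurrences of d032c263 in C_3
print(percent(count, n_observations))  # 3.7
print(percent(count, non_missing))     # 3.9
```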

Most occurring characters

Value             Count   Frequency (%)
2                 57549   7.5%
0                 53357   6.9%
3                 52975   6.9%
c                 50849   6.6%
d                 48831   6.4%
7                 48831   6.4%
1                 48162   6.3%
f                 47448   6.2%
4                 46628   6.1%
a                 45781   6.0%
Other values (6)  268109  34.9%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    487434  63.4%
Lowercase Letter  281086  36.6%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
2      57549  11.8%
0      53357  10.9%
3      52975  10.9%
7      48831  10.0%
1      48162  9.9%
4      46628  9.6%
9      45335  9.3%
6      45072  9.2%
5      44892  9.2%
8      44633  9.2%

Lowercase Letter

Value  Count  Frequency (%)
c      50849  18.1%
d      48831  17.4%
f      47448  16.9%
a      45781  16.3%
b      44612  15.9%
e      43565  15.5%

Most occurring scripts

Value   Count   Frequency (%)
Common  487434  63.4%
Latin   281086  36.6%

Most frequent character per script

Common

Value  Count  Frequency (%)
2      57549  11.8%
0      53357  10.9%
3      52975  10.9%
7      48831  10.0%
1      48162  9.9%
4      46628  9.6%
9      45335  9.3%
6      45072  9.2%
5      44892  9.2%
8      44633  9.2%

Latin

Value  Count  Frequency (%)
c      50849  18.1%
d      48831  17.4%
f      47448  16.9%
a      45781  16.3%
b      44612  15.9%
e      43565  15.5%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  768520  100.0%

Most frequent character per block

ASCII

Value             Count   Frequency (%)
2                 57549   7.5%
0                 53357   6.9%
3                 52975   6.9%
c                 50849   6.6%
d                 48831   6.4%
7                 48831   6.4%
1                 48162   6.3%
f                 47448   6.2%
4                 46628   6.1%
a                 45781   6.0%
Other values (6)  268109  34.9%

C_4
Categorical

HIGH CARDINALITY
MISSING

Distinct: 25183
Distinct (%): 26.2%
Missing: 3935
Missing (%): 3.9%
Memory size: 781.4 KiB

c18be181: 5410
d16679b9: 4380
85dd697c: 3235
13508380: 2330
f922efad: 1487
Other values (25178): 79223

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 768520
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18359
Unique (%): 19.1%

Sample

1st row: 7b4723c4
2nd row: 41274cd7
3rd row: c18be181
4th row: 2e17d6f6
5th row: f922efad

Common Values

Value | Count | Frequency (%)
c18be181 | 5410 | 5.4%
d16679b9 | 4380 | 4.4%
85dd697c | 3235 | 3.2%
13508380 | 2330 | 2.3%
f922efad | 1487 | 1.5%
9a6888fb | 1058 | 1.1%
f56b7dd5 | 972 | 1.0%
3e2bfbda | 908 | 0.9%
29998ed1 | 895 | 0.9%
6a14f9b9 | 854 | 0.9%
Other values (25173) | 74536 | 74.5%
(Missing) | 3935 | 3.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
c18be1815410
 
5.6%
d16679b94380
 
4.6%
85dd697c3235
 
3.4%
135083802330
 
2.4%
f922efad1487
 
1.5%
9a6888fb1058
 
1.1%
f56b7dd5972
 
1.0%
3e2bfbda908
 
0.9%
29998ed1895
 
0.9%
6a14f9b9854
 
0.9%
Other values (25173)74536
77.6%

Most occurring characters

ValueCountFrequency (%)
162635
 
8.2%
858996
 
7.7%
958008
 
7.5%
653526
 
7.0%
b52241
 
6.8%
d51879
 
6.8%
c49561
 
6.4%
545398
 
5.9%
f45325
 
5.9%
e44854
 
5.8%
Other values (6)246097
32.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number483242
62.9%
Lowercase Letter285278
37.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
162635
13.0%
858996
12.2%
958008
12.0%
653526
11.1%
545398
9.4%
042872
8.9%
742396
8.8%
241003
8.5%
339452
8.2%
438956
8.1%
Lowercase Letter
ValueCountFrequency (%)
b52241
18.3%
d51879
18.2%
c49561
17.4%
f45325
15.9%
e44854
15.7%
a41418
14.5%

Most occurring scripts

ValueCountFrequency (%)
Common483242
62.9%
Latin285278
37.1%

Most frequent character per script

Common
ValueCountFrequency (%)
162635
13.0%
858996
12.2%
958008
12.0%
653526
11.1%
545398
9.4%
042872
8.9%
742396
8.8%
241003
8.5%
339452
8.2%
438956
8.1%
Latin
ValueCountFrequency (%)
b52241
18.3%
d51879
18.2%
c49561
17.4%
f45325
15.9%
e44854
15.7%
a41418
14.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII768520
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
162635
 
8.2%
858996
 
7.7%
958008
 
7.5%
653526
 
7.0%
b52241
 
6.8%
d51879
 
6.8%
c49561
 
6.4%
545398
 
5.9%
f45325
 
5.9%
e44854
 
5.8%
Other values (6)246097
32.0%

C_5
Categorical

HIGH CARDINALITY

Distinct: 145
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

25c83c98: 67056
4cf72387: 15783
43b19349: 6228
384874ce: 3224
30903e74: 1990
Other values (140): 5719

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800000
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 52
Unique (%): 0.1%

Sample

1st row: 25c83c98
2nd row: 25c83c98
3rd row: 25c83c98
4th row: 25c83c98
5th row: 25c83c98

Common Values

Value | Count | Frequency (%)
25c83c98 | 67056 | 67.1%
4cf72387 | 15783 | 15.8%
43b19349 | 6228 | 6.2%
384874ce | 3224 | 3.2%
30903e74 | 1990 | 2.0%
0942e0a7 | 1313 | 1.3%
f281d2a7 | 874 | 0.9%
b0530c50 | 560 | 0.6%
b2241560 | 483 | 0.5%
f3474129 | 374 | 0.4%
Other values (135) | 2115 | 2.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
25c83c9867056
67.1%
4cf7238715783
 
15.8%
43b193496228
 
6.2%
384874ce3224
 
3.2%
30903e741990
 
2.0%
0942e0a71313
 
1.3%
f281d2a7874
 
0.9%
b0530c50560
 
0.6%
b2241560483
 
0.5%
f3474129374
 
0.4%
Other values (135)2115
 
2.1%

Most occurring characters

ValueCountFrequency (%)
8158526
19.8%
c154401
19.3%
3104177
13.0%
288258
11.0%
984178
10.5%
569765
8.7%
741153
 
5.1%
439997
 
5.0%
f18195
 
2.3%
09890
 
1.2%
Other values (6)31460
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number605918
75.7%
Lowercase Letter194082
 
24.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8158526
26.2%
3104177
17.2%
288258
14.6%
984178
13.9%
569765
11.5%
741153
 
6.8%
439997
 
6.6%
09890
 
1.6%
18769
 
1.4%
61205
 
0.2%
Lowercase Letter
ValueCountFrequency (%)
c154401
79.6%
f18195
 
9.4%
b8359
 
4.3%
e8032
 
4.1%
a3202
 
1.6%
d1893
 
1.0%

Most occurring scripts

ValueCountFrequency (%)
Common605918
75.7%
Latin194082
 
24.3%

Most frequent character per script

Common
ValueCountFrequency (%)
8158526
26.2%
3104177
17.2%
288258
14.6%
984178
13.9%
569765
11.5%
741153
 
6.8%
439997
 
6.6%
09890
 
1.6%
18769
 
1.4%
61205
 
0.2%
Latin
ValueCountFrequency (%)
c154401
79.6%
f18195
 
9.4%
b8359
 
4.3%
e8032
 
4.1%
a3202
 
1.6%
d1893
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII800000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8158526
19.8%
c154401
19.3%
3104177
13.0%
288258
11.0%
984178
10.5%
569765
8.7%
741153
 
5.1%
439997
 
5.0%
f18195
 
2.3%
09890
 
1.2%
Other values (6)31460
 
3.9%

C_6
Categorical

HIGH CORRELATION
MISSING

Distinct: 11
Distinct (%): < 0.1%
Missing: 13708
Missing (%): 13.7%
Memory size: 781.4 KiB

7e0ccccf: 46438
fbad5c96: 19331
fe6b92e5: 11495
13718bbd: 3802
6f6d9be8: 3460
Other values (6): 1766

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 690336
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 7e0ccccf
2nd row: fe6b92e5
3rd row: 7e0ccccf
4th row: fe6b92e5
5th row: 13718bbd

Common Values

Value | Count | Frequency (%)
7e0ccccf | 46438 | 46.4%
fbad5c96 | 19331 | 19.3%
fe6b92e5 | 11495 | 11.5%
13718bbd | 3802 | 3.8%
6f6d9be8 | 3460 | 3.5%
3bf701e7 | 1728 | 1.7%
e3520422 | 16 | < 0.1%
f1f2de2d | 9 | < 0.1%
c05778d5 | 6 | < 0.1%
c76aecf6 | 5 | < 0.1%
(Missing) | 13708 | 13.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
7e0ccccf46438
53.8%
fbad5c9619331
22.4%
fe6b92e511495
 
13.3%
13718bbd3802
 
4.4%
6f6d9be83460
 
4.0%
3bf701e71728
 
2.0%
e352042216
 
< 0.1%
f1f2de2d9
 
< 0.1%
c05778d56
 
< 0.1%
c76aecf65
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
c205101
29.7%
f82475
11.9%
e74648
 
10.8%
753713
 
7.8%
048188
 
7.0%
b43620
 
6.3%
637758
 
5.5%
934286
 
5.0%
530854
 
4.5%
d26619
 
3.9%
Other values (6)53074
 
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter451799
65.4%
Decimal Number238537
34.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
753713
22.5%
048188
20.2%
637758
15.8%
934286
14.4%
530854
12.9%
211563
 
4.8%
19341
 
3.9%
87268
 
3.0%
35548
 
2.3%
418
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
c205101
45.4%
f82475
18.3%
e74648
 
16.5%
b43620
 
9.7%
d26619
 
5.9%
a19336
 
4.3%

Most occurring scripts

ValueCountFrequency (%)
Latin451799
65.4%
Common238537
34.6%

Most frequent character per script

Common
ValueCountFrequency (%)
753713
22.5%
048188
20.2%
637758
15.8%
934286
14.4%
530854
12.9%
211563
 
4.8%
19341
 
3.9%
87268
 
3.0%
35548
 
2.3%
418
 
< 0.1%
Latin
ValueCountFrequency (%)
c205101
45.4%
f82475
18.3%
e74648
 
16.5%
b43620
 
9.7%
d26619
 
5.9%
a19336
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII690336
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c205101
29.7%
f82475
11.9%
e74648
 
10.8%
753713
 
7.8%
048188
 
7.0%
b43620
 
6.3%
637758
 
5.5%
934286
 
5.0%
530854
 
4.5%
d26619
 
3.9%
Other values (6)53074
 
7.7%

C_7
Categorical

HIGH CARDINALITY

Distinct: 7623
Distinct (%): 7.6%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

3f4ec687: 957
49b74ebc: 902
7195046d: 776
970f01b2: 772
88002ee1: 639
Other values (7618): 95954

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800000
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1737
Unique (%): 1.7%

Sample

1st row: de7995b8
2nd row: 922afcc0
3rd row: c78204a1
4th row: 2e8a689b
5th row: ad9fa255

Common Values

Value | Count | Frequency (%)
3f4ec687 | 957 | 1.0%
49b74ebc | 902 | 0.9%
7195046d | 776 | 0.8%
970f01b2 | 772 | 0.8%
88002ee1 | 639 | 0.6%
9b98e9fc | 633 | 0.6%
38eb9cf4 | 585 | 0.6%
d2d741ca | 581 | 0.6%
81bb0302 | 541 | 0.5%
dc7659bd | 534 | 0.5%
Other values (7613) | 93080 | 93.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
3f4ec687957
 
1.0%
49b74ebc902
 
0.9%
7195046d776
 
0.8%
970f01b2772
 
0.8%
88002ee1639
 
0.6%
9b98e9fc633
 
0.6%
38eb9cf4585
 
0.6%
d2d741ca581
 
0.6%
81bb0302541
 
0.5%
dc7659bd534
 
0.5%
Other values (7613)93080
93.1%

Most occurring characters

ValueCountFrequency (%)
d52044
 
6.5%
951986
 
6.5%
a51863
 
6.5%
451130
 
6.4%
651080
 
6.4%
b50867
 
6.4%
850631
 
6.3%
f50581
 
6.3%
c49621
 
6.2%
749198
 
6.1%
Other values (6)290999
36.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number496381
62.0%
Lowercase Letter303619
38.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
951986
10.5%
451130
10.3%
651080
10.3%
850631
10.2%
749198
9.9%
049195
9.9%
148961
9.9%
548909
9.9%
347675
9.6%
247616
9.6%
Lowercase Letter
ValueCountFrequency (%)
d52044
17.1%
a51863
17.1%
b50867
16.8%
f50581
16.7%
c49621
16.3%
e48643
16.0%

Most occurring scripts

ValueCountFrequency (%)
Common496381
62.0%
Latin303619
38.0%

Most frequent character per script

Common
ValueCountFrequency (%)
951986
10.5%
451130
10.3%
651080
10.3%
850631
10.2%
749198
9.9%
049195
9.9%
148961
9.9%
548909
9.9%
347675
9.6%
247616
9.6%
Latin
ValueCountFrequency (%)
d52044
17.1%
a51863
17.1%
b50867
16.8%
f50581
16.7%
c49621
16.3%
e48643
16.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII800000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
d52044
 
6.5%
951986
 
6.5%
a51863
 
6.5%
451130
 
6.4%
651080
 
6.4%
b50867
 
6.4%
850631
 
6.3%
f50581
 
6.3%
c49621
 
6.2%
749198
 
6.1%
Other values (6)290999
36.4%

C_8
Categorical

HIGH CARDINALITY

Distinct: 257
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

0b153874: 59264
5b392875: 16689
1f89b562: 7485
37e4aa92: 4234
062b5529: 2638
Other values (252): 9690

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800000
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 112
Unique (%): 0.1%

Sample

1st row: 1f89b562
2nd row: 0b153874
3rd row: 0b153874
4th row: 0b153874
5th row: 0b153874

Common Values

Value | Count | Frequency (%)
0b153874 | 59264 | 59.3%
5b392875 | 16689 | 16.7%
1f89b562 | 7485 | 7.5%
37e4aa92 | 4234 | 4.2%
062b5529 | 2638 | 2.6%
51d76abe | 1774 | 1.8%
c8ddd494 | 1258 | 1.3%
64523cfa | 979 | 1.0%
6c41e35e | 709 | 0.7%
985e3fcb | 628 | 0.6%
Other values (247) | 4342 | 4.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0b15387459264
59.3%
5b39287516689
 
16.7%
1f89b5627485
 
7.5%
37e4aa924234
 
4.2%
062b55292638
 
2.6%
51d76abe1774
 
1.8%
c8ddd4941258
 
1.3%
64523cfa979
 
1.0%
6c41e35e709
 
0.7%
985e3fcb628
 
0.6%
Other values (247)4342
 
4.3%

Most occurring characters

ValueCountFrequency (%)
5112287
14.0%
b89910
11.2%
887228
10.9%
384877
10.6%
783619
10.5%
171099
8.9%
470397
8.8%
063647
8.0%
237957
 
4.7%
935149
 
4.4%
Other values (6)63830
8.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number663362
82.9%
Lowercase Letter136638
 
17.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5112287
16.9%
887228
13.1%
384877
12.8%
783619
12.6%
171099
10.7%
470397
10.6%
063647
9.6%
237957
 
5.7%
935149
 
5.3%
617102
 
2.6%
Lowercase Letter
ValueCountFrequency (%)
b89910
65.8%
a13167
 
9.6%
f11994
 
8.8%
e9246
 
6.8%
d6957
 
5.1%
c5364
 
3.9%

Most occurring scripts

ValueCountFrequency (%)
Common663362
82.9%
Latin136638
 
17.1%

Most frequent character per script

Common
ValueCountFrequency (%)
5112287
16.9%
887228
13.1%
384877
12.8%
783619
12.6%
171099
10.7%
470397
10.6%
063647
9.6%
237957
 
5.7%
935149
 
5.3%
617102
 
2.6%
Latin
ValueCountFrequency (%)
b89910
65.8%
a13167
 
9.6%
f11994
 
8.8%
e9246
 
6.8%
d6957
 
5.1%
c5364
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII800000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5112287
14.0%
b89910
11.2%
887228
10.9%
384877
10.6%
783619
10.5%
171099
8.9%
470397
8.8%
063647
8.0%
237957
 
4.7%
935149
 
4.4%
Other values (6)63830
8.0%

C_9
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

a73ee510: 88523
7cc72ec2: 11445
a18233ea: 32

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800000
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a73ee510
2nd row: a73ee510
3rd row: a73ee510
4th row: a73ee510
5th row: a73ee510

Common Values

Value | Count | Frequency (%)
a73ee510 | 88523 | 88.5%
7cc72ec2 | 11445 | 11.4%
a18233ea | 32 | < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
a73ee51088523
88.5%
7cc72ec211445
 
11.4%
a18233ea32
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e188523
23.6%
7111413
13.9%
a88587
11.1%
388587
11.1%
188555
11.1%
588523
11.1%
088523
11.1%
c34335
 
4.3%
222922
 
2.9%
832
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number488555
61.1%
Lowercase Letter311445
38.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7111413
22.8%
388587
18.1%
188555
18.1%
588523
18.1%
088523
18.1%
222922
 
4.7%
832
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
e188523
60.5%
a88587
28.4%
c34335
 
11.0%

Most occurring scripts

ValueCountFrequency (%)
Common488555
61.1%
Latin311445
38.9%

Most frequent character per script

Common
ValueCountFrequency (%)
7111413
22.8%
388587
18.1%
188555
18.1%
588523
18.1%
088523
18.1%
222922
 
4.7%
832
 
< 0.1%
Latin
ValueCountFrequency (%)
e188523
60.5%
a88587
28.4%
c34335
 
11.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII800000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e188523
23.6%
7111413
13.9%
a88587
11.1%
388587
11.1%
188555
11.1%
588523
11.1%
088523
11.1%
c34335
 
4.3%
222922
 
2.9%
832
 
< 0.1%

C_10
Categorical

HIGH CARDINALITY

Distinct: 10997
Distinct (%): 11.0%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

3b08e48b: 26595
fbbf2c95: 887
efea433b: 824
6c47047a: 764
0e9ead52: 749
Other values (10992): 70181

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800000
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5167
Unique (%): 5.2%

Sample

1st row: a8cd5504
2nd row: 2b53e5fb
3rd row: 3b08e48b
4th row: efea433b
5th row: 5282c137

Common Values

Value | Count | Frequency (%)
3b08e48b | 26595 | 26.6%
fbbf2c95 | 887 | 0.9%
efea433b | 824 | 0.8%
6c47047a | 764 | 0.8%
0e9ead52 | 749 | 0.7%
fa7d0797 | 705 | 0.7%
5ba575e7 | 565 | 0.6%
5162b19c | 483 | 0.5%
451bd4e4 | 446 | 0.4%
7f79890b | 426 | 0.4%
Other values (10987) | 67556 | 67.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
3b08e48b26595
 
26.6%
fbbf2c95887
 
0.9%
efea433b824
 
0.8%
6c47047a764
 
0.8%
0e9ead52749
 
0.7%
fa7d0797705
 
0.7%
5ba575e7565
 
0.6%
5162b19c483
 
0.5%
451bd4e4446
 
0.4%
7f79890b426
 
0.4%
Other values (10987)67556
67.6%

Most occurring characters

ValueCountFrequency (%)
b89877
11.2%
886977
10.9%
e65706
 
8.2%
065014
 
8.1%
464886
 
8.1%
360393
 
7.5%
741407
 
5.2%
a38661
 
4.8%
f37565
 
4.7%
237522
 
4.7%
Other values (6)211992
26.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number496628
62.1%
Lowercase Letter303372
37.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
886977
17.5%
065014
13.1%
464886
13.1%
360393
12.2%
741407
8.3%
237522
7.6%
536593
7.4%
935219
7.1%
634974
7.0%
133643
 
6.8%
Lowercase Letter
ValueCountFrequency (%)
b89877
29.6%
e65706
21.7%
a38661
12.7%
f37565
12.4%
d36408
12.0%
c35155
 
11.6%

Most occurring scripts

ValueCountFrequency (%)
Common496628
62.1%
Latin303372
37.9%

Most frequent character per script

Common
ValueCountFrequency (%)
886977
17.5%
065014
13.1%
464886
13.1%
360393
12.2%
741407
8.3%
237522
7.6%
536593
7.4%
935219
7.1%
634974
7.0%
133643
 
6.8%
Latin
ValueCountFrequency (%)
b89877
29.6%
e65706
21.7%
a38661
12.7%
f37565
12.4%
d36408
12.0%
c35155
 
11.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII800000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
b89877
11.2%
886977
10.9%
e65706
 
8.2%
065014
 
8.1%
464886
 
8.1%
360393
 
7.5%
741407
 
5.2%
a38661
 
4.8%
f37565
 
4.7%
237522
 
4.7%
Other values (6)211992
26.5%

C_11
Categorical

HIGH CARDINALITY

Distinct: 3799
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

c4adf918: 2118
7f8ffe57: 1539
e51ddf94: 1196
4d8549da: 1157
f25fe7e9: 932
Other values (3794): 93058

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800000
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 581
Unique (%): 0.6%

Sample

1st row: b2cb9c98
2nd row: 4f1b46f3
3rd row: 5f5e6091
4th row: e51ddf94
5th row: e5d8af57

Common Values

Value | Count | Frequency (%)
c4adf918 | 2118 | 2.1%
7f8ffe57 | 1539 | 1.5%
e51ddf94 | 1196 | 1.2%
4d8549da | 1157 | 1.2%
f25fe7e9 | 932 | 0.9%
755e4a50 | 822 | 0.8%
36bccca0 | 771 | 0.8%
b7094596 | 761 | 0.8%
5874c9c9 | 745 | 0.7%
a7b606c4 | 721 | 0.7%
Other values (3789) | 89238 | 89.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
c4adf9182118
 
2.1%
7f8ffe571539
 
1.5%
e51ddf941196
 
1.2%
4d8549da1157
 
1.2%
f25fe7e9932
 
0.9%
755e4a50822
 
0.8%
36bccca0771
 
0.8%
b7094596761
 
0.8%
5874c9c9745
 
0.7%
a7b606c4721
 
0.7%
Other values (3789)89238
89.2%

Most occurring characters

ValueCountFrequency (%)
757200
 
7.1%
955664
 
7.0%
a53144
 
6.6%
553078
 
6.6%
d52846
 
6.6%
452229
 
6.5%
e50154
 
6.3%
650023
 
6.3%
f49772
 
6.2%
848655
 
6.1%
Other values (6)277235
34.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number499058
62.4%
Lowercase Letter300942
37.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
757200
11.5%
955664
11.2%
553078
10.6%
452229
10.5%
650023
10.0%
848655
9.7%
048445
9.7%
147536
9.5%
243171
8.7%
343057
8.6%
Lowercase Letter
ValueCountFrequency (%)
a53144
17.7%
d52846
17.6%
e50154
16.7%
f49772
16.5%
c47888
15.9%
b47138
15.7%

Most occurring scripts

ValueCountFrequency (%)
Common499058
62.4%
Latin300942
37.6%

Most frequent character per script

Common
ValueCountFrequency (%)
757200
11.5%
955664
11.2%
553078
10.6%
452229
10.5%
650023
10.0%
848655
9.7%
048445
9.7%
147536
9.5%
243171
8.7%
343057
8.6%
Latin
ValueCountFrequency (%)
a53144
17.7%
d52846
17.6%
e50154
16.7%
f49772
16.5%
c47888
15.9%
b47138
15.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII800000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
757200
 
7.1%
955664
 
7.0%
a53144
 
6.6%
553078
 
6.6%
d52846
 
6.6%
452229
 
6.5%
e50154
 
6.3%
650023
 
6.3%
f49772
 
6.2%
848655
 
6.1%
Other values (6)277235
34.7%

C_12
Categorical

HIGH CARDINALITY
MISSING

Distinct: 41311
Distinct (%): 43.0%
Missing: 3935
Missing (%): 3.9%
Memory size: 781.4 KiB

dfbb09fb: 3703
e0d76380: 1900
8fe001f4: 1707
d8c29807: 1579
9f32b866: 1571
Other values (41306): 85605

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 768520
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 35262
Unique (%): 36.7%

Sample

1st row: 37c9c164
2nd row: 623049e6
3rd row: 8fe001f4
4th row: a30567ca
5th row: 66a76a26

Common Values

Value | Count | Frequency (%)
dfbb09fb | 3703 | 3.7%
e0d76380 | 1900 | 1.9%
8fe001f4 | 1707 | 1.7%
d8c29807 | 1579 | 1.6%
9f32b866 | 1571 | 1.6%
fb8fab62 | 1058 | 1.1%
ae1bb660 | 972 | 1.0%
21a23bfe | 900 | 0.9%
6aaba33c | 895 | 0.9%
a2f4e8b5 | 855 | 0.9%
Other values (41301) | 80925 | 80.9%
(Missing) | 3935 | 3.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
dfbb09fb3703
 
3.9%
e0d763801900
 
2.0%
8fe001f41707
 
1.8%
d8c298071579
 
1.6%
9f32b8661571
 
1.6%
fb8fab621058
 
1.1%
ae1bb660972
 
1.0%
21a23bfe900
 
0.9%
6aaba33c895
 
0.9%
a2f4e8b5855
 
0.9%
Other values (41301)80925
84.2%

Most occurring characters

ValueCountFrequency (%)
b60268
 
7.8%
f54860
 
7.1%
652359
 
6.8%
951645
 
6.7%
050579
 
6.6%
849321
 
6.4%
248777
 
6.3%
348619
 
6.3%
d46607
 
6.1%
e45986
 
6.0%
Other values (6)259499
33.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number473876
61.7%
Lowercase Letter294644
38.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
652359
11.0%
951645
10.9%
050579
10.7%
849321
10.4%
248777
10.3%
348619
10.3%
544680
9.4%
744023
9.3%
142111
8.9%
441762
8.8%
Lowercase Letter
ValueCountFrequency (%)
b60268
20.5%
f54860
18.6%
d46607
15.8%
e45986
15.6%
a45697
15.5%
c41226
14.0%

Most occurring scripts

ValueCountFrequency (%)
Common473876
61.7%
Latin294644
38.3%

Most frequent character per script

Common
ValueCountFrequency (%)
652359
11.0%
951645
10.9%
050579
10.7%
849321
10.4%
248777
10.3%
348619
10.3%
544680
9.4%
744023
9.3%
142111
8.9%
441762
8.8%
Latin
ValueCountFrequency (%)
b60268
20.5%
f54860
18.6%
d46607
15.8%
e45986
15.6%
a45697
15.5%
c41226
14.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII768520
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
b60268
 
7.8%
f54860
 
7.1%
652359
 
6.8%
951645
 
6.7%
050579
 
6.6%
849321
 
6.4%
248777
 
6.3%
348619
 
6.3%
d46607
 
6.1%
e45986
 
6.0%
Other values (6)259499
33.8%

C_13
Categorical

HIGH CARDINALITY

Distinct: 2796
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

85dbe138: 2327
46f42a63: 1669
3516f6e6: 1327
51b97b8f: 1157
80467802: 1111
Other values (2791): 92409

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800000
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 272
Unique (%): 0.3%

Sample

1st row: 2824a5f6
2nd row: d7020589
3rd row: aa655a2f
4th row: 3516f6e6
5th row: f06c53ac

Common Values

Value | Count | Frequency (%)
85dbe138 | 2327 | 2.3%
46f42a63 | 1669 | 1.7%
3516f6e6 | 1327 | 1.3%
51b97b8f | 1157 | 1.2%
80467802 | 1111 | 1.1%
6e5da64f | 1093 | 1.1%
ebd756bd | 1067 | 1.1%
740c210d | 1009 | 1.0%
dd183b4c | 932 | 0.9%
5978055e | 822 | 0.8%
Other values (2786) | 87486 | 87.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
85dbe1382327
 
2.3%
46f42a631669
 
1.7%
3516f6e61327
 
1.3%
51b97b8f1157
 
1.2%
804678021111
 
1.1%
6e5da64f1093
 
1.1%
ebd756bd1067
 
1.1%
740c210d1009
 
1.0%
dd183b4c932
 
0.9%
5978055e822
 
0.8%
Other values (2786)87486
87.5%

Most occurring characters

ValueCountFrequency (%)
b55463
 
6.9%
f52355
 
6.5%
852076
 
6.5%
551971
 
6.5%
451008
 
6.4%
e50926
 
6.4%
650899
 
6.4%
150188
 
6.3%
950026
 
6.3%
049930
 
6.2%
Other values (6)285158
35.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number501018
62.6%
Lowercase Letter298982
37.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
852076
10.4%
551971
10.4%
451008
10.2%
650899
10.2%
150188
10.0%
950026
10.0%
049930
10.0%
348531
9.7%
748376
9.7%
248013
9.6%
Lowercase Letter
ValueCountFrequency (%)
b55463
18.6%
f52355
17.5%
e50926
17.0%
d49672
16.6%
a45353
15.2%
c45213
15.1%

Most occurring scripts

ValueCountFrequency (%)
Common501018
62.6%
Latin298982
37.4%

Most frequent character per script

Common
ValueCountFrequency (%)
852076
10.4%
551971
10.4%
451008
10.2%
650899
10.2%
150188
10.0%
950026
10.0%
049930
10.0%
348531
9.7%
748376
9.7%
248013
9.6%
Latin
ValueCountFrequency (%)
b55463
18.6%
f52355
17.5%
e50926
17.0%
d49672
16.6%
a45353
15.2%
c45213
15.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII800000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
b55463
 
6.9%
f52355
 
6.5%
852076
 
6.5%
551971
 
6.5%
451008
 
6.4%
e50926
 
6.4%
650899
 
6.4%
150188
 
6.3%
950026
 
6.3%
049930
 
6.2%
Other values (6)285158
35.6%

C_14
Categorical

HIGH CORRELATION

Distinct: 26
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

07d13a8f: 36315
b28479f6: 34577
1adce6ef: 15964
64c94865: 3553
cfef1c29: 2268
Other values (21): 7323

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800000
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 1adce6ef
2nd row: b28479f6
3rd row: 07d13a8f
4th row: 07d13a8f
5th row: 1adce6ef

Common Values

Value | Count | Frequency (%)
07d13a8f | 36315 | 36.3%
b28479f6 | 34577 | 34.6%
1adce6ef | 15964 | 16.0%
64c94865 | 3553 | 3.6%
cfef1c29 | 2268 | 2.3%
8ceecbc8 | 1830 | 1.8%
051219e6 | 1717 | 1.7%
f862f261 | 1260 | 1.3%
f7c1b33f | 617 | 0.6%
d2dfe871 | 538 | 0.5%
Other values (16) | 1361 | 1.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
07d13a8f36315
36.3%
b28479f634577
34.6%
1adce6ef15964
16.0%
64c948653553
 
3.6%
cfef1c292268
 
2.3%
8ceecbc81830
 
1.8%
051219e61717
 
1.7%
f862f2611260
 
1.3%
f7c1b33f617
 
0.6%
d2dfe871538
 
0.5%
Other values (16)1361
 
1.4%

Most occurring characters

ValueCountFrequency (%)
f96074
12.0%
880491
10.1%
772747
 
9.1%
662717
 
7.8%
161634
 
7.7%
d54106
 
6.8%
a53005
 
6.6%
942691
 
5.3%
242619
 
5.3%
442097
 
5.3%
Other values (6)191819
24.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number487497
60.9%
Lowercase Letter312503
39.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
880491
16.5%
772747
14.9%
662717
12.9%
161634
12.6%
942691
8.8%
242619
8.7%
442097
8.6%
338542
7.9%
038459
7.9%
55500
 
1.1%
Lowercase Letter
ValueCountFrequency (%)
f96074
30.7%
d54106
17.3%
a53005
17.0%
e40781
13.0%
b37261
 
11.9%
c31276
 
10.0%

Most occurring scripts

ValueCountFrequency (%)
Common487497
60.9%
Latin312503
39.1%

Most frequent character per script

Common
ValueCountFrequency (%)
880491
16.5%
772747
14.9%
662717
12.9%
161634
12.6%
942691
8.8%
242619
8.7%
442097
8.6%
338542
7.9%
038459
7.9%
55500
 
1.1%
Latin
ValueCountFrequency (%)
f96074
30.7%
d54106
17.3%
a53005
17.0%
e40781
13.0%
b37261
 
11.9%
c31276
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII800000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f96074
12.0%
880491
10.1%
772747
 
9.1%
662717
 
7.8%
161634
 
7.7%
d54106
 
6.8%
a53005
 
6.6%
942691
 
5.3%
242619
 
5.3%
442097
 
5.3%
Other values (6)191819
24.0%

C_15
Categorical

HIGH CARDINALITY

Distinct: 5238
Distinct (%): 5.2%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

36721ddc: 2054
52baadf5: 1526
42b3012c: 976
10040656: 972
dbc5e126: 930
Other values (5233): 93542

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800000
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1677
Unique (%): 1.7%

Sample

1st row: 8ba8b39a
2nd row: e6c5b5cd
3rd row: 6dc710ed
4th row: 18231224
5th row: 8ff4b403

Common Values

Value | Count | Frequency (%)
36721ddc | 2054 | 2.1%
52baadf5 | 1526 | 1.5%
42b3012c | 976 | 1.0%
10040656 | 972 | 1.0%
dbc5e126 | 930 | 0.9%
d345b1a0 | 881 | 0.9%
0f942372 | 804 | 0.8%
9efd8b77 | 777 | 0.8%
f3635baf | 755 | 0.8%
d2f03b75 | 733 | 0.7%
Other values (5228) | 89592 | 89.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
36721ddc2054
 
2.1%
52baadf51526
 
1.5%
42b3012c976
 
1.0%
10040656972
 
1.0%
dbc5e126930
 
0.9%
d345b1a0881
 
0.9%
0f942372804
 
0.8%
9efd8b77777
 
0.8%
f3635baf755
 
0.8%
d2f03b75733
 
0.7%
Other values (5228)89592
89.6%

Most occurring characters

ValueCountFrequency (%)
262839
 
7.9%
d54182
 
6.8%
354068
 
6.8%
553672
 
6.7%
153093
 
6.6%
652012
 
6.5%
f51356
 
6.4%
a50992
 
6.4%
b50878
 
6.4%
048952
 
6.1%
Other values (6)267956
33.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number504163
63.0%
Lowercase Letter295837
37.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
262839
12.5%
354068
10.7%
553672
10.6%
153093
10.5%
652012
10.3%
048952
9.7%
747445
9.4%
844665
8.9%
944188
8.8%
443229
8.6%
Lowercase Letter
ValueCountFrequency (%)
d54182
18.3%
f51356
17.4%
a50992
17.2%
b50878
17.2%
c46334
15.7%
e42095
14.2%

Most occurring scripts

ValueCountFrequency (%)
Common504163
63.0%
Latin295837
37.0%

Most frequent character per script

Common
ValueCountFrequency (%)
262839
12.5%
354068
10.7%
553672
10.6%
153093
10.5%
652012
10.3%
048952
9.7%
747445
9.4%
844665
8.9%
944188
8.8%
443229
8.6%
Latin
ValueCountFrequency (%)
d54182
18.3%
f51356
17.4%
a50992
17.2%
b50878
17.2%
c46334
15.7%
e42095
14.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII800000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
262839
 
7.9%
d54182
 
6.8%
354068
 
6.8%
553672
 
6.7%
153093
 
6.6%
652012
 
6.5%
f51356
 
6.4%
a50992
 
6.4%
b50878
 
6.4%
048952
 
6.1%
Other values (6)267956
33.5%

C_16
Categorical

HIGH CARDINALITY
MISSING

Distinct: 34616
Distinct (%): 36.0%
Missing: 3935
Missing (%): 3.9%
Memory size: 781.4 KiB

84898b2a: 3703
1203a270: 1900
36103458: 1707
31ca40b6: 1587
c64d548f: 1579
Other values (34611): 85589

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 768520
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique28185 ?
Unique (%)29.3%

Sample

1st row891b62e7
2nd rowc92f3b61
3rd row36103458
4th row52b8680f
5th row01adbab4
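
The Unicode properties mentioned above can be inspected with Python's standard `unicodedata` module. The following is a minimal sketch, not the report generator's own code; it classifies the characters of the five C_16 sample rows by general category and checks that they all fall in the ASCII range.

```python
import unicodedata
from collections import Counter

# The five sample rows listed for C_16 in this report.
samples = ["891b62e7", "c92f3b61", "36103458", "52b8680f", "01adbab4"]

# General category per character: "Nd" is Decimal Number, "Ll" is Lowercase Letter.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

# Code points below 128 belong to the Basic Latin (ASCII) block.
all_ascii = all(ord(ch) < 128 for value in samples for ch in value)

print(category_counts)
print(all_ascii)
```

Only two categories appear in the samples, which agrees with the "Distinct categories: 2" and "Distinct blocks: 1" figures.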

Common Values

Value                   Count    Frequency (%)
84898b2a                3703     3.7%
1203a270                1900     1.9%
36103458                1707     1.7%
31ca40b6                1587     1.6%
c64d548f                1579     1.6%
c6b1e1b2                1058     1.1%
bad5ee18                972      1.0%
587267a3                900      0.9%
b041b04a                895      0.9%
89052618                872      0.9%
Other values (34606)    80892    80.9%
(Missing)               3935     3.9%

Length

Histogram of lengths of the category

Value                   Count    Frequency (%)
84898b2a                3703     3.9%
1203a270                1900     2.0%
36103458                1707     1.8%
31ca40b6                1587     1.7%
c64d548f                1579     1.6%
c6b1e1b2                1058     1.1%
bad5ee18                972      1.0%
587267a3                900      0.9%
b041b04a                895      0.9%
89052618                872      0.9%
Other values (34606)    80892    84.2%

Most occurring characters

Value               Count     Frequency (%)
8                   58032     7.6%
4                   53885     7.0%
b                   53858     7.0%
a                   52592     6.8%
2                   51523     6.7%
1                   50861     6.6%
6                   49539     6.4%
0                   48381     6.3%
3                   48213     6.3%
5                   43818     5.7%
Other values (6)    257818    33.5%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      489511    63.7%
Lowercase Letter    279009    36.3%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
8        58032    11.9%
4        53885    11.0%
2        51523    10.5%
1        50861    10.4%
6        49539    10.1%
0        48381    9.9%
3        48213    9.8%
5        43818    9.0%
9        43003    8.8%
7        42256    8.6%
Lowercase Letter
Value    Count    Frequency (%)
b        53858    19.3%
a        52592    18.8%
e        43795    15.7%
d        43532    15.6%
c        42896    15.4%
f        42336    15.2%

Most occurring scripts

Value     Count     Frequency (%)
Common    489511    63.7%
Latin     279009    36.3%

Most frequent character per script

Common
Value    Count    Frequency (%)
8        58032    11.9%
4        53885    11.0%
2        51523    10.5%
1        50861    10.4%
6        49539    10.1%
0        48381    9.9%
3        48213    9.8%
5        43818    9.0%
9        43003    8.8%
7        42256    8.6%
Latin
Value    Count    Frequency (%)
b        53858    19.3%
a        52592    18.8%
e        43795    15.7%
d        43532    15.6%
c        42896    15.4%
f        42336    15.2%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    768520    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
8                   58032     7.6%
4                   53885     7.0%
b                   53858     7.0%
a                   52592     6.8%
2                   51523     6.7%
1                   50861     6.6%
6                   49539     6.4%
0                   48381     6.3%
3                   48213     6.3%
5                   43818     5.7%
Other values (6)    257818    33.5%

C_17
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

e5ba7672            42705
07c540c4            13033
d4bb7bd8            12639
3486227d            7548
776ce399            6418
Other values (5)    17657

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800000
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: e5ba7672
2nd row: 07c540c4
3rd row: 8efede7f
4th row: 1e88c74f
5th row: 1e88c74f
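
The memory size of 781.4 KiB is identical for every categorical column in this section, regardless of how many values are missing. One reading that reproduces the figure, offered here as an inference rather than something the report states, is a shallow size: one 8-byte object reference per row plus a small fixed index overhead. Both constants below are assumptions.

```python
n_rows = 100_000     # number of observations in the dataset
pointer_bytes = 8    # assumption: one 64-bit object reference per row
index_bytes = 128    # assumption: fixed overhead of a default integer range index

size_kib = (n_rows * pointer_bytes + index_bytes) / 1024
print(f"{size_kib:.1f} KiB")
```

Under this reading the string contents themselves are not counted, so the figure understates the real footprint of the column.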

Common Values

Value       Count    Frequency (%)
e5ba7672    42705    42.7%
07c540c4    13033    13.0%
d4bb7bd8    12639    12.6%
3486227d    7548     7.5%
776ce399    6418     6.4%
2005abd1    4760     4.8%
1e88c74f    4559     4.6%
27c07bd6    4199     4.2%
8efede7f    4138     4.1%
af5d780c    1        < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value       Count    Frequency (%)
e5ba7672    42705    42.7%
07c540c4    13033    13.0%
d4bb7bd8    12639    12.6%
3486227d    7548     7.5%
776ce399    6418     6.4%
2005abd1    4760     4.8%
1e88c74f    4559     4.6%
27c07bd6    4199     4.2%
8efede7f    4138     4.1%
af5d780c    1        < 0.1%

Most occurring characters

Value               Count     Frequency (%)
7                   148562    18.6%
b                   89581     11.2%
2                   66760     8.3%
e                   66096     8.3%
6                   60870     7.6%
5                   60499     7.6%
4                   50812     6.4%
a                   47466     5.9%
d                   45924     5.7%
c                   41243     5.2%
Other values (6)    122187    15.3%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      496854    62.1%
Lowercase Letter    303146    37.9%

Most frequent character per category

Decimal Number
Value    Count     Frequency (%)
7        148562    29.9%
2        66760     13.4%
6        60870     12.3%
5        60499     12.2%
4        50812     10.2%
0        39786     8.0%
8        33444     6.7%
3        13966     2.8%
9        12836     2.6%
1        9319      1.9%
Lowercase Letter
Value    Count    Frequency (%)
b        89581    29.6%
e        66096    21.8%
a        47466    15.7%
d        45924    15.1%
c        41243    13.6%
f        12836    4.2%

Most occurring scripts

Value     Count     Frequency (%)
Common    496854    62.1%
Latin     303146    37.9%

Most frequent character per script

Common
Value    Count     Frequency (%)
7        148562    29.9%
2        66760     13.4%
6        60870     12.3%
5        60499     12.2%
4        50812     10.2%
0        39786     8.0%
8        33444     6.7%
3        13966     2.8%
9        12836     2.6%
1        9319      1.9%
Latin
Value    Count    Frequency (%)
b        89581    29.6%
e        66096    21.8%
a        47466    15.7%
d        45924    15.1%
c        41243    13.6%
f        12836    4.2%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    800000    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
7                   148562    18.6%
b                   89581     11.2%
2                   66760     8.3%
e                   66096     8.3%
6                   60870     7.6%
5                   60499     7.6%
4                   50812     6.4%
a                   47466     5.9%
d                   45924     5.7%
c                   41243     5.2%
Other values (6)    122187    15.3%

C_18
Categorical

HIGH CARDINALITY

Distinct: 2548
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

5aed7436               5348
891589e7               2879
e88ffc9d               2590
582152eb               2156
7ef5affa               2133
Other values (2543)    84894

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800000
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 647
Unique (%): 0.6%

Sample

1st row: f54016b9
2nd row: b04e4670
3rd row: 3412118d
4th row: 74ef3502
5th row: 26b3c7a7
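
The summary for C_18 lists the five most frequent values and folds the remaining 2543 distinct values into a single "Other values" row. A minimal sketch of that kind of roll-up follows; the helper name `top_k_with_other` and the toy data are hypothetical, and this is not claimed to be how the report itself is built.

```python
from collections import Counter

def top_k_with_other(counts: Counter, k: int):
    """Return the k most common (value, count) pairs plus the aggregated remainder."""
    top = counts.most_common(k)
    other_count = sum(counts.values()) - sum(c for _, c in top)
    other_distinct = len(counts) - len(top)
    return top, other_distinct, other_count

# Toy data (hypothetical), only to exercise the helper.
toy = Counter({"aa": 5, "bb": 3, "cc": 1, "dd": 1})
top, other_distinct, other_count = top_k_with_other(toy, 2)

# Consistency check against the C_18 figures listed above.
top5_counts = [5348, 2879, 2590, 2156, 2133]
c18_other_rows = 100_000 - sum(top5_counts)
c18_other_distinct = 2548 - len(top5_counts)
print(c18_other_distinct, c18_other_rows)
```

The check reproduces the "Other values (2543)" row with its count of 84894.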

Common Values

Value                  Count    Frequency (%)
5aed7436               5348     5.3%
891589e7               2879     2.9%
e88ffc9d               2590     2.6%
582152eb               2156     2.2%
7ef5affa               2133     2.1%
c21c3e4c               1757     1.8%
1f868fdd               1735     1.7%
005c6740               1735     1.7%
f54016b9               1731     1.7%
bd17c3da               1570     1.6%
Other values (2538)    76366    76.4%

Length

Histogram of lengths of the category

Value                  Count    Frequency (%)
5aed7436               5348     5.3%
891589e7               2879     2.9%
e88ffc9d               2590     2.6%
582152eb               2156     2.2%
7ef5affa               2133     2.1%
c21c3e4c               1757     1.8%
1f868fdd               1735     1.7%
005c6740               1735     1.7%
f54016b9               1731     1.7%
bd17c3da               1570     1.6%
Other values (2538)    76366    76.4%

Most occurring characters

Value               Count     Frequency (%)
f                   63816     8.0%
8                   58086     7.3%
e                   55692     7.0%
5                   54887     6.9%
4                   53229     6.7%
d                   50432     6.3%
6                   50001     6.3%
3                   49439     6.2%
c                   47333     5.9%
7                   47210     5.9%
Other values (6)    269875    33.7%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      491537    61.4%
Lowercase Letter    308463    38.6%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
8        58086    11.8%
5        54887    11.2%
4        53229    10.8%
6        50001    10.2%
3        49439    10.1%
7        47210    9.6%
9        47131    9.6%
1        46826    9.5%
0        43302    8.8%
2        41426    8.4%
Lowercase Letter
Value    Count    Frequency (%)
f        63816    20.7%
e        55692    18.1%
d        50432    16.3%
c        47333    15.3%
a        45888    14.9%
b        45302    14.7%

Most occurring scripts

Value     Count     Frequency (%)
Common    491537    61.4%
Latin     308463    38.6%

Most frequent character per script

Common
Value    Count    Frequency (%)
8        58086    11.8%
5        54887    11.2%
4        53229    10.8%
6        50001    10.2%
3        49439    10.1%
7        47210    9.6%
9        47131    9.6%
1        46826    9.5%
0        43302    8.8%
2        41426    8.4%
Latin
Value    Count    Frequency (%)
f        63816    20.7%
e        55692    18.1%
d        50432    16.3%
c        47333    15.3%
a        45888    14.9%
b        45302    14.7%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    800000    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
f                   63816     8.0%
8                   58086     7.3%
e                   55692     7.0%
5                   54887     6.9%
4                   53229     6.7%
d                   50432     6.3%
6                   50001     6.3%
3                   49439     6.2%
c                   47333     5.9%
7                   47210     5.9%
Other values (6)    269875    33.7%

C_19
Categorical

HIGH CARDINALITY
MISSING

Distinct: 1302
Distinct (%): 2.2%
Missing: 41471
Missing (%): 41.5%
Memory size: 781.4 KiB

21ddcdc9               34151
55dd3565               2385
cf99e5de               1215
9437f62f               904
1d1eb838               597
Other values (1297)    19277

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 468232
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 331
Unique (%): 0.6%

Sample

1st row: 21ddcdc9
2nd row: 21ddcdc9
3rd row: 21ddcdc9
4th row: 18259a83
5th row: 21ddcdc9

Common Values

Value                  Count    Frequency (%)
21ddcdc9               34151    34.2%
55dd3565               2385     2.4%
cf99e5de               1215     1.2%
9437f62f               904      0.9%
1d1eb838               597      0.6%
5b885066               489      0.5%
712d530c               443      0.4%
1d04f4a4               417      0.4%
9653bb65               363      0.4%
a153cea2               362      0.4%
Other values (1292)    17203    17.2%
(Missing)              41471    41.5%

Length

Histogram of lengths of the category

Value                  Count    Frequency (%)
21ddcdc9               34151    58.3%
55dd3565               2385     4.1%
cf99e5de               1215     2.1%
9437f62f               904      1.5%
1d1eb838               597      1.0%
5b885066               489      0.8%
712d530c               443      0.8%
1d04f4a4               417      0.7%
9653bb65               363      0.6%
a153cea2               362      0.6%
Other values (1292)    17203    29.4%
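
The two C_19 frequency tables show the same counts with different percentages. The numbers are consistent with the Common Values table dividing by all 100000 rows and the table after the histogram dividing by the 58529 non-missing rows only. The sketch below checks that reading on three rows; the helper name `share` is hypothetical.

```python
n_rows = 100_000      # number of observations in the dataset
n_missing = 41_471    # C_19 "Missing"
counts = {"21ddcdc9": 34_151, "55dd3565": 2_385, "cf99e5de": 1_215}

def share(count: int, denominator: int) -> str:
    """Format a count as a percentage with one decimal place."""
    return f"{100 * count / denominator:.1f}%"

for value, count in counts.items():
    of_all_rows = share(count, n_rows)                # Common Values table
    of_present = share(count, n_rows - n_missing)     # table after the histogram
    print(value, of_all_rows, of_present)
```

For a column with 41.5% missing values the two views differ substantially, so it matters which table a figure is quoted from.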

Most occurring characters

Value               Count     Frequency (%)
d                   117514    25.1%
c                   78144     16.7%
9                   45919     9.8%
2                   44693     9.5%
1                   44166     9.4%
5                   22052     4.7%
3                   15014     3.2%
f                   14115     3.0%
6                   12472     2.7%
4                   12398     2.6%
Other values (6)    61745     13.2%

Most occurring categories

Value               Count     Frequency (%)
Lowercase Letter    240439    51.4%
Decimal Number      227793    48.6%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
9        45919    20.2%
2        44693    19.6%
1        44166    19.4%
5        22052    9.7%
3        15014    6.6%
6        12472    5.5%
4        12398    5.4%
7        11286    5.0%
8        9936     4.4%
0        9857     4.3%
Lowercase Letter
Value    Count     Frequency (%)
d        117514    48.9%
c        78144     32.5%
f        14115     5.9%
e        11765     4.9%
b        10521     4.4%
a        8380      3.5%

Most occurring scripts

Value     Count     Frequency (%)
Latin     240439    51.4%
Common    227793    48.6%

Most frequent character per script

Common
Value    Count    Frequency (%)
9        45919    20.2%
2        44693    19.6%
1        44166    19.4%
5        22052    9.7%
3        15014    6.6%
6        12472    5.5%
4        12398    5.4%
7        11286    5.0%
8        9936     4.4%
0        9857     4.3%
Latin
Value    Count     Frequency (%)
d        117514    48.9%
c        78144     32.5%
f        14115     5.9%
e        11765     4.9%
b        10521     4.4%
a        8380      3.5%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    468232    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
d                   117514    25.1%
c                   78144     16.7%
9                   45919     9.8%
2                   44693     9.5%
1                   44166     9.4%
5                   22052     4.7%
3                   15014     3.2%
f                   14115     3.0%
6                   12472     2.7%
4                   12398     2.6%
Other values (6)    61745     13.2%

C_20
Categorical

MISSING

Distinct: 3
Distinct (%): < 0.1%
Missing: 41471
Missing (%): 41.5%
Memory size: 781.4 KiB

5840adea    21517
a458ea53    18983
b1252a9d    18029

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 468232
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: b1252a9d
2nd row: 5840adea
3rd row: 5840adea
4th row: a458ea53
5th row: b1252a9d

Common Values

Value        Count    Frequency (%)
5840adea     21517    21.5%
a458ea53     18983    19.0%
b1252a9d     18029    18.0%
(Missing)    41471    41.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value       Count    Frequency (%)
5840adea    21517    36.8%
a458ea53    18983    32.4%
b1252a9d    18029    30.8%

Most occurring characters

Value               Count    Frequency (%)
a                   99029    21.1%
5                   77512    16.6%
8                   40500    8.6%
4                   40500    8.6%
e                   40500    8.6%
d                   39546    8.4%
2                   36058    7.7%
0                   21517    4.6%
3                   18983    4.1%
b                   18029    3.9%
Other values (2)    36058    7.7%
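
Because C_20 has only three distinct values, its character table can be rebuilt from the value counts alone: each character contributes its number of occurrences within a value, multiplied by that value's row count. The sketch below does this with a plain `Counter`, using the counts listed for C_20 above.

```python
from collections import Counter

# Value counts for C_20 as listed in this report.
value_counts = {"5840adea": 21517, "a458ea53": 18983, "b1252a9d": 18029}

# Weight every character occurrence by the number of rows holding that value.
char_counts = Counter()
for value, n_rows in value_counts.items():
    for ch in value:
        char_counts[ch] += n_rows

total_chars = sum(char_counts.values())
print(char_counts["a"], char_counts["5"], len(char_counts), total_chars)
```

The result matches the reported figures: 99029 occurrences of "a", 77512 of "5", 12 distinct characters, and 468232 characters in total.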

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      271128    57.9%
Lowercase Letter    197104    42.1%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
5        77512    28.6%
8        40500    14.9%
4        40500    14.9%
2        36058    13.3%
0        21517    7.9%
3        18983    7.0%
1        18029    6.6%
9        18029    6.6%
Lowercase Letter
Value    Count    Frequency (%)
a        99029    50.2%
e        40500    20.5%
d        39546    20.1%
b        18029    9.1%

Most occurring scripts

Value     Count     Frequency (%)
Common    271128    57.9%
Latin     197104    42.1%

Most frequent character per script

Common
Value    Count    Frequency (%)
5        77512    28.6%
8        40500    14.9%
4        40500    14.9%
2        36058    13.3%
0        21517    7.9%
3        18983    7.0%
1        18029    6.6%
9        18029    6.6%
Latin
Value    Count    Frequency (%)
a        99029    50.2%
e        40500    20.5%
d        39546    20.1%
b        18029    9.1%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    468232    100.0%

Most frequent character per block

ASCII
Value               Count    Frequency (%)
a                   99029    21.1%
5                   77512    16.6%
8                   40500    8.6%
4                   40500    8.6%
e                   40500    8.6%
d                   39546    8.4%
2                   36058    7.7%
0                   21517    4.6%
3                   18983    4.1%
b                   18029    3.9%
Other values (2)    36058    7.7%

C_21
Categorical

HIGH CARDINALITY
MISSING

Distinct: 38617
Distinct (%): 40.2%
Missing: 3935
Missing (%): 3.9%
Memory size: 781.4 KiB

0014c32a                3703
73d06dde                1900
e587c466                1707
5f957280                1579
dfcfc3fa                1571
Other values (38612)    85605

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 768520
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 32439
Unique (%): 33.8%

Sample

1st row: 07b5194c
2nd row: 60f6221e
3rd row: e587c466
4th row: 6b3a5ca6
5th row: 21c9516a
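
A fixed length of 8, 16 distinct characters, and only decimal digits and lowercase letters are all consistent with the values being 8-digit lowercase hexadecimal tokens. The report does not say this explicitly, so the pattern below is an assumption; the sketch shows how such a format check could be written, with `is_hex8` as a hypothetical helper name.

```python
import re

# Assumed format: exactly eight characters drawn from 0-9 and a-f.
HEX8 = re.compile(r"[0-9a-f]{8}")

def is_hex8(value: str) -> bool:
    """Return True if the whole string is an 8-digit lowercase hex token."""
    return HEX8.fullmatch(value) is not None

# The five sample rows listed for C_21 in this report.
samples = ["07b5194c", "60f6221e", "e587c466", "6b3a5ca6", "21c9516a"]
all_match = all(is_hex8(v) for v in samples)
print(all_match)
```

Running the same check over a full column would flag any value that breaks the pattern, such as uppercase digits or truncated tokens.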

Common Values

Value                   Count    Frequency (%)
0014c32a                3703     3.7%
73d06dde                1900     1.9%
e587c466                1707     1.7%
5f957280                1579     1.6%
dfcfc3fa                1571     1.6%
99c09e97                1058     1.1%
0429f84b                972      1.0%
c2a93b37                900      0.9%
723b4dfd                895      0.9%
d4703ebd                855      0.9%
Other values (38607)    80925    80.9%
(Missing)               3935     3.9%

Length

Histogram of lengths of the category

Value                   Count    Frequency (%)
0014c32a                3703     3.9%
73d06dde                1900     2.0%
e587c466                1707     1.8%
5f957280                1579     1.6%
dfcfc3fa                1571     1.6%
99c09e97                1058     1.1%
0429f84b                972      1.0%
c2a93b37                900      0.9%
723b4dfd                895      0.9%
d4703ebd                855      0.9%
Other values (38607)    80925    84.2%

Most occurring characters

Value               Count     Frequency (%)
0                   54458     7.1%
d                   52773     6.9%
c                   51007     6.6%
7                   50432     6.6%
3                   50287     6.5%
4                   48831     6.4%
9                   48800     6.3%
2                   48189     6.3%
f                   47985     6.2%
a                   47011     6.1%
Other values (6)    268747    35.0%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      479536    62.4%
Lowercase Letter    288984    37.6%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
0        54458    11.4%
7        50432    10.5%
3        50287    10.5%
4        48831    10.2%
9        48800    10.2%
2        48189    10.0%
1        45266    9.4%
6        45168    9.4%
5        45013    9.4%
8        43092    9.0%
Lowercase Letter
Value    Count    Frequency (%)
d        52773    18.3%
c        51007    17.7%
f        47985    16.6%
a        47011    16.3%
e        46732    16.2%
b        43476    15.0%

Most occurring scripts

Value     Count     Frequency (%)
Common    479536    62.4%
Latin     288984    37.6%

Most frequent character per script

Common
Value    Count    Frequency (%)
0        54458    11.4%
7        50432    10.5%
3        50287    10.5%
4        48831    10.2%
9        48800    10.2%
2        48189    10.0%
1        45266    9.4%
6        45168    9.4%
5        45013    9.4%
8        43092    9.0%
Latin
Value    Count    Frequency (%)
d        52773    18.3%
c        51007    17.7%
f        47985    16.6%
a        47011    16.3%
e        46732    16.2%
b        43476    15.0%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    768520    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
0                   54458     7.1%
d                   52773     6.9%
c                   51007     6.6%
7                   50432     6.6%
3                   50287     6.5%
4                   48831     6.4%
9                   48800     6.3%
2                   48189     6.3%
f                   47985     6.2%
a                   47011     6.1%
Other values (6)    268747    35.0%

C_22
Categorical

HIGH CORRELATION
MISSING

Distinct: 10
Distinct (%): 0.1%
Missing: 81566
Missing (%): 81.6%
Memory size: 781.4 KiB

ad3062eb            9640
c9d4222a            7023
8ec974f4            751
78e2e389            582
c0061c6d            402
Other values (5)    36

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 147472
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ad3062eb
2nd row: 8ec974f4
3rd row: ad3062eb
4th row: c9d4222a
5th row: ad3062eb

Common Values

Value        Count    Frequency (%)
ad3062eb     9640     9.6%
c9d4222a     7023     7.0%
8ec974f4     751      0.8%
78e2e389     582      0.6%
c0061c6d     402      0.4%
8651fddb     13       < 0.1%
ccfd4002     11       < 0.1%
49e825c5     7        < 0.1%
28f45308     3        < 0.1%
d9ce1838     2        < 0.1%
(Missing)    81566    81.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value       Count    Frequency (%)
ad3062eb    9640     52.3%
c9d4222a    7023     38.1%
8ec974f4    751      4.1%
78e2e389    582      3.2%
c0061c6d    402      2.2%
8651fddb    13       0.1%
ccfd4002    11       0.1%
49e825c5    7        < 0.1%
28f45308    3        < 0.1%
d9ce1838    2        < 0.1%

Most occurring characters

Value               Count    Frequency (%)
2                   31312    21.2%
d                   17104    11.6%
a                   16663    11.3%
e                   11564    7.8%
0                   10469    7.1%
6                   10457    7.1%
3                   10227    6.9%
b                   9653     6.5%
c                   8609     5.8%
4                   8546     5.8%
Other values (6)    12868    8.7%

Most occurring categories

Value               Count    Frequency (%)
Decimal Number      83101    56.4%
Lowercase Letter    64371    43.6%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
2        31312    37.7%
0        10469    12.6%
6        10457    12.6%
3        10227    12.3%
4        8546     10.3%
9        8365     10.1%
8        1945     2.3%
7        1333     1.6%
1        417      0.5%
5        30       < 0.1%
Lowercase Letter
Value    Count    Frequency (%)
d        17104    26.6%
a        16663    25.9%
e        11564    18.0%
b        9653     15.0%
c        8609     13.4%
f        778      1.2%

Most occurring scripts

Value     Count    Frequency (%)
Common    83101    56.4%
Latin     64371    43.6%

Most frequent character per script

Common
Value    Count    Frequency (%)
2        31312    37.7%
0        10469    12.6%
6        10457    12.6%
3        10227    12.3%
4        8546     10.3%
9        8365     10.1%
8        1945     2.3%
7        1333     1.6%
1        417      0.5%
5        30       < 0.1%
Latin
Value    Count    Frequency (%)
d        17104    26.6%
a        16663    25.9%
e        11564    18.0%
b        9653     15.0%
c        8609     13.4%
f        778      1.2%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    147472    100.0%

Most frequent character per block

ASCII
Value               Count    Frequency (%)
2                   31312    21.2%
d                   17104    11.6%
a                   16663    11.3%
e                   11564    7.8%
0                   10469    7.1%
6                   10457    7.1%
3                   10227    6.9%
b                   9653     6.5%
c                   8609     5.8%
4                   8546     5.8%
Other values (6)    12868    8.7%

C_23
Categorical

HIGH CORRELATION

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.4 KiB

32c7478e            43970
3a171ecb            19944
423fab69            11383
be7c41b4            7079
bcdee96c            6835
Other values (9)    10789

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800000
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3a171ecb
2nd row: 3a171ecb
3rd row: 3a171ecb
4th row: 3a171ecb
5th row: 32c7478e

Common Values

Value               Count    Frequency (%)
32c7478e            43970    44.0%
3a171ecb            19944    19.9%
423fab69            11383    11.4%
be7c41b4            7079     7.1%
bcdee96c            6835     6.8%
c7dc6720            4522     4.5%
55dd3565            3029     3.0%
dbb486d7            1396     1.4%
93bad2c0            988      1.0%
c3dc6cef            385      0.4%
Other values (4)    469      0.5%

Length

Histogram of lengths of the category

Value               Count    Frequency (%)
32c7478e            43970    44.0%
3a171ecb            19944    19.9%
423fab69            11383    11.4%
be7c41b4            7079     7.1%
bcdee96c            6835     6.8%
c7dc6720            4522     4.5%
55dd3565            3029     3.0%
dbb486d7            1396     1.4%
93bad2c0            988      1.0%
c3dc6cef            385      0.4%
Other values (4)    469      0.5%

Most occurring characters

Value               Count     Frequency (%)
7                   125430    15.7%
c                   95869     12.0%
e                   85067     10.6%
3                   79718     10.0%
4                   71029     8.9%
2                   61020     7.6%
b                   56241     7.0%
1                   46967     5.9%
8                   45686     5.7%
a                   32757     4.1%
Other values (6)    100216    12.5%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      496398    62.0%
Lowercase Letter    303602    38.0%

Most frequent character per category

Decimal Number
Value    Count     Frequency (%)
7        125430    25.3%
3        79718     16.1%
4        71029     14.3%
2        61020     12.3%
1        46967     9.5%
8        45686     9.2%
6        27813     5.6%
9        19870     4.0%
5        13111     2.6%
0        5754      1.2%
Lowercase Letter
Value    Count    Frequency (%)
c        95869    31.6%
e        85067    28.0%
b        56241    18.5%
a        32757    10.8%
d        21900    7.2%
f        11768    3.9%

Most occurring scripts

Value     Count     Frequency (%)
Common    496398    62.0%
Latin     303602    38.0%

Most frequent character per script

Common
Value    Count     Frequency (%)
7        125430    25.3%
3        79718     16.1%
4        71029     14.3%
2        61020     12.3%
1        46967     9.5%
8        45686     9.2%
6        27813     5.6%
9        19870     4.0%
5        13111     2.6%
0        5754      1.2%
Latin
Value    Count    Frequency (%)
c        95869    31.6%
e        85067    28.0%
b        56241    18.5%
a        32757    10.8%
d        21900    7.2%
f        11768    3.9%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    800000    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
7                   125430    15.7%
c                   95869     12.0%
e                   85067     10.6%
3                   79718     10.0%
4                   71029     8.9%
2                   61020     7.6%
b                   56241     7.0%
1                   46967     5.9%
8                   45686     5.7%
a                   32757     4.1%
Other values (6)    100216    12.5%

C_24
Categorical

HIGH CARDINALITY
MISSING

Distinct: 12334
Distinct (%): 12.8%
Missing: 3935
Missing (%): 3.9%
Memory size: 781.4 KiB

1793a828                6941
3fdb382b                6504
3b183c5c                5991
aee52b6f                4519
b34f3128                3441
Other values (12329)    68669

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 768520
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7266
Unique (%): 7.6%

Sample

1st row: c5c50484
2nd row: 43f13e8b
3rd row: 3b183c5c
4th row: 9117a34a
5th row: b34f3128
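
The percentage figures for C_24 do not all share a denominator. The arithmetic below suggests that "Missing (%)" is taken over all rows, while "Distinct (%)" and "Unique (%)" are taken over the non-missing rows only. This is a reading inferred from the numbers, with "Unique" understood as values that occur exactly once.

```python
n_rows = 100_000     # number of observations in the dataset
n_missing = 3_935    # C_24 "Missing"
n_distinct = 12_334  # C_24 "Distinct"
n_unique = 7_266     # C_24 "Unique" (values occurring exactly once)

n_present = n_rows - n_missing   # rows that actually hold a value

missing_pct = 100 * n_missing / n_rows
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present

print(f"{missing_pct:.1f}% {distinct_pct:.1f}% {unique_pct:.1f}%")
```

Dividing the distinct count by all 100000 rows instead would give 12.3%, which does not match the reported 12.8%.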

Common Values

Value                   Count    Frequency (%)
1793a828                6941     6.9%
3fdb382b                6504     6.5%
3b183c5c                5991     6.0%
aee52b6f                4519     4.5%
b34f3128                3441     3.4%
45ab94c8                2851     2.9%
9117a34a                1479     1.5%
335a6a1e                1075     1.1%
c0d61a5c                972      1.0%
ded4aac9                833      0.8%
Other values (12324)    61459    61.5%
(Missing)               3935     3.9%

Length

Histogram of lengths of the category

Value                   Count    Frequency (%)
1793a828                6941     7.2%
3fdb382b                6504     6.8%
3b183c5c                5991     6.2%
aee52b6f                4519     4.7%
b34f3128                3441     3.6%
45ab94c8                2851     3.0%
9117a34a                1479     1.5%
335a6a1e                1075     1.1%
c0d61a5c                972      1.0%
ded4aac9                833      0.9%
Other values (12324)    61459    64.0%

Most occurring characters

Value               Count     Frequency (%)
3                   71194     9.3%
8                   67821     8.8%
b                   60114     7.8%
a                   56413     7.3%
2                   49319     6.4%
c                   48965     6.4%
1                   48917     6.4%
5                   47239     6.1%
f                   47199     6.1%
4                   42939     5.6%
Other values (6)    228400    29.7%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      474344    61.7%
Lowercase Letter    294176    38.3%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
3        71194    15.0%
8        67821    14.3%
2        49319    10.4%
1        48917    10.3%
5        47239    10.0%
4        42939    9.1%
9        41260    8.7%
7        38555    8.1%
6        37839    8.0%
0        29261    6.2%
Lowercase Letter
Value    Count    Frequency (%)
b        60114    20.4%
a        56413    19.2%
c        48965    16.6%
f        47199    16.0%
e        41717    14.2%
d        39768    13.5%

Most occurring scripts

Value     Count     Frequency (%)
Common    474344    61.7%
Latin     294176    38.3%

Most frequent character per script

Common
Value    Count    Frequency (%)
3        71194    15.0%
8        67821    14.3%
2        49319    10.4%
1        48917    10.3%
5        47239    10.0%
4        42939    9.1%
9        41260    8.7%
7        38555    8.1%
6        37839    8.0%
0        29261    6.2%
Latin
Value    Count    Frequency (%)
b        60114    20.4%
a        56413    19.2%
c        48965    16.6%
f        47199    16.0%
e        41717    14.2%
d        39768    13.5%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    768520    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
3                   71194     9.3%
8                   67821     8.8%
b                   60114     7.8%
a                   56413     7.3%
2                   49319     6.4%
c                   48965     6.4%
1                   48917     6.4%
5                   47239     6.1%
f                   47199     6.1%
4                   42939     5.6%
Other values (6)    228400    29.7%

C_25
Categorical

MISSING

Distinct: 50
Distinct (%): 0.1%
Missing: 41471
Missing (%): 41.5%
Memory size: 781.4 KiB

e8b83407             18487
001f3601             11451
ea9a246c             5976
010f6491             3806
cb079c2d             3291
Other values (45)    15518

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 468232
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: e8b83407
2nd row: e8b83407
3rd row: ea9a246c
4th row: 9b3e8820
5th row: e8b83407

Common Values

Value                Count    Frequency (%)
e8b83407             18487    18.5%
001f3601             11451    11.5%
ea9a246c             5976     6.0%
010f6491             3806     3.8%
cb079c2d             3291     3.3%
9b3e8820             2900     2.9%
2bf691b1             2209     2.2%
445bbe3b             1768     1.8%
f0f449dd             1494     1.5%
9d93af03             1239     1.2%
Other values (40)    5908     5.9%
(Missing)            41471    41.5%

Length

Histogram of lengths of the category

Value                Count    Frequency (%)
e8b83407             18487    31.6%
001f3601             11451    19.6%
ea9a246c             5976     10.2%
010f6491             3806     6.5%
cb079c2d             3291     5.6%
9b3e8820             2900     5.0%
2bf691b1             2209     3.8%
445bbe3b             1768     3.0%
f0f449dd             1494     2.6%
9d93af03             1239     2.1%
Other values (40)    5908     10.1%

Most occurring characters

Value               Count    Frequency (%)
0                   73198    15.6%
8                   43525    9.3%
4                   39463    8.4%
b                   38947    8.3%
3                   38898    8.3%
1                   35695    7.6%
e                   30799    6.6%
6                   27514    5.9%
7                   26926    5.8%
9                   25107    5.4%
Other values (6)    88160    18.8%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      332409    71.0%
Lowercase Letter    135823    29.0%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
0        73198    22.0%
8        43525    13.1%
4        39463    11.9%
3        38898    11.7%
1        35695    10.7%
6        27514    8.3%
7        26926    8.1%
9        25107    7.6%
2        17891    5.4%
5        4192     1.3%
Lowercase Letter
Value    Count    Frequency (%)
b        38947    28.7%
e        30799    22.7%
f        24867    18.3%
a        15857    11.7%
c        14355    10.6%
d        10998    8.1%

Most occurring scripts

Value     Count     Frequency (%)
Common    332409    71.0%
Latin     135823    29.0%

Most frequent character per script

Common
Value    Count    Frequency (%)
0        73198    22.0%
8        43525    13.1%
4        39463    11.9%
3        38898    11.7%
1        35695    10.7%
6        27514    8.3%
7        26926    8.1%
9        25107    7.6%
2        17891    5.4%
5        4192     1.3%
Latin
Value    Count    Frequency (%)
b        38947    28.7%
e        30799    22.7%
f        24867    18.3%
a        15857    11.7%
c        14355    10.6%
d        10998    8.1%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    468232    100.0%

Most frequent character per block

ASCII
Value               Count    Frequency (%)
0                   73198    15.6%
8                   43525    9.3%
4                   39463    8.4%
b                   38947    8.3%
3                   38898    8.3%
1                   35695    7.6%
e                   30799    6.6%
6                   27514    5.9%
7                   26926    5.8%
9                   25107    5.4%
Other values (6)    88160    18.8%

C_26
Categorical

HIGH CARDINALITY
MISSING

Distinct: 9526
Distinct (%): 16.3%
Missing: 41471
Missing (%): 41.5%
Memory size: 781.4 KiB

49d68486               5046
c84c4aec               2330
2fede552               1574
984e0db0               1523
b7d9c3bc               1299
Other values (9521)    46757

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 468232
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5614
Unique (%): 9.6%

Sample

1st row: 9727dd16
2nd row: 731c3655
3rd row: 9a556cfc
4th row: 8967c0d2
5th row: c43c3f58

Common Values

Value                  Count    Frequency (%)
49d68486               5046     5.0%
c84c4aec               2330     2.3%
2fede552               1574     1.6%
984e0db0               1523     1.5%
b7d9c3bc               1299     1.3%
aa5f0a15               1271     1.3%
c27f155b               944      0.9%
9904c656               883      0.9%
b9809574               864      0.9%
56be3401               743      0.7%
Other values (9516)    42052    42.1%
(Missing)              41471    41.5%

Length

Histogram of lengths of the category

Value                  Count    Frequency (%)
49d68486               5046     8.6%
c84c4aec               2330     4.0%
2fede552               1574     2.7%
984e0db0               1523     2.6%
b7d9c3bc               1299     2.2%
aa5f0a15               1271     2.2%
c27f155b               944      1.6%
9904c656               883      1.5%
b9809574               864      1.5%
56be3401               743      1.3%
Other values (9516)    42052    71.8%

Most occurring characters

Value               Count     Frequency (%)
4                   41320     8.8%
8                   36624     7.8%
6                   34399     7.3%
c                   33769     7.2%
9                   32004     6.8%
d                   31140     6.7%
5                   30961     6.6%
a                   27815     5.9%
0                   27498     5.9%
e                   26687     5.7%
Other values (6)    146015    31.2%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      297685    63.6%
Lowercase Letter    170547    36.4%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
4        41320    13.9%
8        36624    12.3%
6        34399    11.6%
9        32004    10.8%
5        30961    10.4%
0        27498    9.2%
1        24269    8.2%
3        24082    8.1%
7        23794    8.0%
2        22734    7.6%
Lowercase Letter
Value    Count    Frequency (%)
c        33769    19.8%
d        31140    18.3%
a        27815    16.3%
e        26687    15.6%
b        26175    15.3%
f        24961    14.6%

Most occurring scripts

Value     Count     Frequency (%)
Common    297685    63.6%
Latin     170547    36.4%

Most frequent character per script

Common
Value    Count    Frequency (%)
4        41320    13.9%
8        36624    12.3%
6        34399    11.6%
9        32004    10.8%
5        30961    10.4%
0        27498    9.2%
1        24269    8.2%
3        24082    8.1%
7        23794    8.0%
2        22734    7.6%
Latin
Value    Count    Frequency (%)
c        33769    19.8%
d        31140    18.3%
a        27815    16.3%
e        26687    15.6%
b        26175    15.3%
f        24961    14.6%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    468232    100.0%

Most frequent character per block

ASCII
Value               Count     Frequency (%)
4                   41320     8.8%
8                   36624     7.8%
6                   34399     7.3%
c                   33769     7.2%
9                   32004     6.8%
d                   31140     6.7%
5                   30961     6.6%
a                   27815     5.9%
0                   27498     5.9%
e                   26687     5.7%
Other values (6)    146015    31.2%

Interactions

[Interaction plots (Matplotlib v3.5.2 SVG images) are not preserved in this text version.]
2022-08-03T20:39:57.894431image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:58.583979image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:59.139236image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:59.782891image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:00.346702image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:00.893287image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:01.521554image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.064343image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.675584image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:03.190362image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:03.809255image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:56.757209image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:57.374672image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:57.942949image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:58.628709image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:59.184246image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:59.831648image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:00.396431image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:00.938331image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:01.568789image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.108598image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.720932image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:03.225629image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:03.849621image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:56.796041image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:57.419086image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:57.984764image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:58.669655image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:59.227742image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:59.874743image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:00.440342image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:01.050492image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:01.611491image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.148358image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.762216image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:03.260957image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:03.889735image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:56.836145image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:57.462003image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:58.025397image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:58.709678image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:59.271200image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:59.918940image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:00.481264image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:01.092123image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:01.653455image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.188711image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.802760image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:03.297832image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:03.930195image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:56.947479image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:57.504184image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:58.067015image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:58.748566image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:59.314005image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:59.961678image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:00.521115image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:01.134141image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:01.694198image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.230387image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.843370image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:03.334715image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:03.971412image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:56.988502image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:57.549219image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:58.110853image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:58.789742image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:59.359750image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:00.006412image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:00.564826image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:01.179078image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:01.739256image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.271486image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.887349image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:03.371659image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:04.008400image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:57.024933image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:57.589064image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:58.147984image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:58.825681image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:59.400231image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:00.045237image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:00.601427image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:01.216331image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:01.776406image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.381934image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.923373image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:03.407270image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:04.048196image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:57.064103image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:57.633440image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:58.318806image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:58.866685image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:39:59.444977image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:00.085189image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:00.642314image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:01.258123image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:01.817449image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.421481image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:02.964468image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-03T20:40:03.442725image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
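
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the report's own implementation: the helper names `average_ranks` and `spearman_rho` are hypothetical, and giving tied values their average rank is an assumption about tie handling.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but clearly nonlinear
rho = spearman_rho(x, y)
```

Because only the ordering of the values matters, `rho` comes out at 1 (up to floating-point rounding) even though the relationship between `x` and `y` is cubic rather than linear.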

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
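
As a rough sketch of that formula, the following plain-Python function divides the covariance by the product of the standard deviations. The name `pearson_r` and the toy data are illustrative assumptions, not part of the report.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of covariance and variances cancel, so plain sums suffice.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

r = pearson_r(x, y)
# Shifting and positively rescaling x leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y)
```

Comparing `r` with `r_rescaled` illustrates the invariance under changes of location and scale mentioned above; a negative scale factor would flip the sign of r.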

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
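
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows; the function name `kendall_tau_a` is hypothetical, and library implementations often default to a tie-adjusted variant such as tau-b, so their results can differ when ties are present.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1  # both variables order the pair the same way
        elif direction < 0:
            discordant += 1  # the variables order the pair in opposite ways
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
```

In this toy example there are six pairs, five concordant and one discordant (the second and third observations swap order), so `tau` is (5 - 1) / 6.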

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
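
A minimal sketch of a bias-corrected Cramér's V in plain Python is shown below. It follows the commonly cited form of Bergsma's correction (shrinking φ² and the table dimensions by terms of order 1/(n-1)); whether the report computes it in exactly this way is an assumption, and the name `cramers_v_corrected` is hypothetical.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row_value, row_count in row_totals.items():
        for col_value, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_value, col_value), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Correction terms (assumed form of the Bergsma 2013 estimator).
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table, e.g. a constant column
    return math.sqrt(phi2_corr / denom)


a = ["x", "y"] * 50
b = ["p", "q"] * 50                 # determined entirely by a
c = ["p", "p", "q", "q"] * 25       # balanced against a, hence independent

v_perfect = cramers_v_corrected(a, b)
v_independent = cramers_v_corrected(a, c)
```

On this toy data, `v_perfect` is 1 up to rounding and `v_independent` is 0, matching the two ends of the range described above.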

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
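
The nullity correlation shown in the heatmap can be sketched as a Pearson correlation between presence indicators (1 where a value is present, 0 where it is missing). This is an assumption about how the heatmap is computed, not a statement of the report's implementation. The function name `nullity_correlation` is hypothetical, missing values are represented as `None` for the sake of the example, and the three short columns are the first six values of I_1, I_10 and I_12 from the sample rows.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return None  # a fully present or fully missing column has no defined correlation
    return cov / math.sqrt(var_a * var_b)


# First six sample rows; None stands in for NaN.
i_1 = [1.0, 2.0, 2.0, None, 3.0, None]
i_10 = [1.0, 1.0, 1.0, None, 1.0, None]
i_12 = [None, None, 3.0, None, None, None]

same_pattern = nullity_correlation(i_1, i_10)
weak_pattern = nullity_correlation(i_1, i_12)
```

In these six rows I_1 and I_10 are missing in exactly the same places, so `same_pattern` is 1 up to rounding, while I_12 is mostly missing regardless of I_1 and yields a much weaker value. Six rows are far too few to draw conclusions about the full dataset; the example only illustrates the computation.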

Sample

First rows

index label I_1 I_2 I_3 I_4 I_5 I_6 I_7 I_8 I_9 I_10 I_11 I_12 I_13 C_1 C_2 C_3 C_4 C_5 C_6 C_7 C_8 C_9 C_10 C_11 C_12 C_13 C_14 C_15 C_16 C_17 C_18 C_19 C_20 C_21 C_22 C_23 C_24 C_25 C_26
0 0 1.0 1.0 5.0 0.0 1382.0 4.0 15.0 2.0 181.0 1.0 2.0 NaN 2.0 68fd1e64 80e26c9b fb936136 7b4723c4 25c83c98 7e0ccccf de7995b8 1f89b562 a73ee510 a8cd5504 b2cb9c98 37c9c164 2824a5f6 1adce6ef 8ba8b39a 891b62e7 e5ba7672 f54016b9 21ddcdc9 b1252a9d 07b5194c NaN 3a171ecb c5c50484 e8b83407 9727dd16
1 0 2.0 0.0 44.0 1.0 102.0 8.0 2.0 2.0 4.0 1.0 1.0 NaN 4.0 68fd1e64 f0cf0024 6f67f7e5 41274cd7 25c83c98 fe6b92e5 922afcc0 0b153874 a73ee510 2b53e5fb 4f1b46f3 623049e6 d7020589 b28479f6 e6c5b5cd c92f3b61 07c540c4 b04e4670 21ddcdc9 5840adea 60f6221e NaN 3a171ecb 43f13e8b e8b83407 731c3655
2 0 2.0 0.0 1.0 14.0 767.0 89.0 4.0 2.0 245.0 1.0 3.0 3.0 45.0 287e684f 0a519c5c 02cf9876 c18be181 25c83c98 7e0ccccf c78204a1 0b153874 a73ee510 3b08e48b 5f5e6091 8fe001f4 aa655a2f 07d13a8f 6dc710ed 36103458 8efede7f 3412118d NaN NaN e587c466 ad3062eb 3a171ecb 3b183c5c NaN NaN
3 0 NaN 893.0 NaN NaN 4392.0 NaN 0.0 0.0 0.0 NaN 0.0 NaN NaN 68fd1e64 2c16a946 a9a87e68 2e17d6f6 25c83c98 fe6b92e5 2e8a689b 0b153874 a73ee510 efea433b e51ddf94 a30567ca 3516f6e6 07d13a8f 18231224 52b8680f 1e88c74f 74ef3502 NaN NaN 6b3a5ca6 NaN 3a171ecb 9117a34a NaN NaN
4 0 3.0 -1.0 NaN 0.0 2.0 0.0 3.0 0.0 0.0 1.0 1.0 NaN 0.0 8cf07265 ae46a29d c81688bb f922efad 25c83c98 13718bbd ad9fa255 0b153874 a73ee510 5282c137 e5d8af57 66a76a26 f06c53ac 1adce6ef 8ff4b403 01adbab4 1e88c74f 26b3c7a7 NaN NaN 21c9516a NaN 32c7478e b34f3128 NaN NaN
5 0 NaN -1.0 NaN NaN 12824.0 NaN 0.0 0.0 6.0 NaN 0.0 NaN NaN 05db9164 6c9c9cf3 2730ec9c 5400db8b 43b19349 6f6d9be8 53b5f978 0b153874 a73ee510 3b08e48b 91e8fc27 be45b877 9ff13f22 07d13a8f 06969a20 9bc7fff5 776ce399 92555263 NaN NaN 242bb710 8ec974f4 be7c41b4 72c78f11 NaN NaN
6 0 NaN 1.0 2.0 NaN 3168.0 NaN 0.0 1.0 2.0 NaN 0.0 NaN NaN 439a44a4 ad4527a2 c02372d0 d34ebbaa 43b19349 fe6b92e5 4bc6ffea 0b153874 a73ee510 3b08e48b a4609aab 14d63538 772a00d7 07d13a8f f9d1382e b00d3dc9 776ce399 cdfa8259 NaN NaN 20062612 NaN 93bad2c0 1b256e61 NaN NaN
7 1 1.0 4.0 2.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 1.0 NaN 0.0 68fd1e64 2c16a946 503b9dbc e4dbea90 f3474129 13718bbd 38eb9cf4 1f89b562 a73ee510 547c0ffe bc8c9f21 60ab2f07 46f42a63 07d13a8f 18231224 e6b6bdc7 e5ba7672 74ef3502 NaN NaN 5316a17f NaN 32c7478e 9117a34a NaN NaN
8 0 NaN 44.0 4.0 8.0 19010.0 249.0 28.0 31.0 141.0 NaN 1.0 NaN 8.0 05db9164 d833535f d032c263 c18be181 25c83c98 7e0ccccf d5b6acf2 0b153874 a73ee510 2acdcf4e 086ac2d2 dfbb09fb 41a6ae00 b28479f6 e2502ec9 84898b2a e5ba7672 42a2edb9 NaN NaN 0014c32a NaN 32c7478e 3b183c5c NaN NaN
9 0 NaN 35.0 NaN 1.0 33737.0 21.0 1.0 2.0 3.0 NaN 1.0 NaN 1.0 05db9164 510b40a5 d03e7c24 eb1fd928 25c83c98 NaN 52283d1c 0b153874 a73ee510 015ac893 e51ddf94 951fe4a9 3516f6e6 07d13a8f 2ae4121c 8ec71479 d4bb7bd8 70d0f5f9 NaN NaN 0e63fca0 NaN 32c7478e 0e8fe315 NaN NaN

Last rows

index label I_1 I_2 I_3 I_4 I_5 I_6 I_7 I_8 I_9 I_10 I_11 I_12 I_13 C_1 C_2 C_3 C_4 C_5 C_6 C_7 C_8 C_9 C_10 C_11 C_12 C_13 C_14 C_15 C_16 C_17 C_18 C_19 C_20 C_21 C_22 C_23 C_24 C_25 C_26
99990 1 3.0 113.0 1.0 3.0 1.0 0.0 3.0 15.0 16.0 2.0 2.0 NaN 0.0 8cf07265 08d6d899 9143c832 f56b7dd5 384874ce 7e0ccccf e702f4b9 37e4aa92 a73ee510 c69c38d4 2bcfb78f ae1bb660 e6fc496d b28479f6 bffbd637 bad5ee18 07c540c4 bbf70d82 NaN NaN 0429f84b NaN bcdee96c c0d61a5c NaN NaN
99991 0 NaN 0.0 70.0 5.0 3732.0 138.0 6.0 9.0 33.0 NaN 2.0 0.0 5.0 9a89b36c 287130e0 fba5f98b fa03b78f 25c83c98 NaN 4813a129 0b153874 a73ee510 556d5fe3 e3baf8d4 0423ff45 bc1e82c6 07d13a8f 10040656 2fa6a093 3486227d 891589e7 7e57d0f3 b1252a9d 66a4872a NaN c7dc6720 8d17c565 724b04da 36447d5d
99992 0 0.0 0.0 1.0 NaN 3153.0 66.0 3.0 3.0 142.0 0.0 1.0 NaN NaN 75ac2fe6 6887a43c 9b792af9 9c6d05a0 25c83c98 fe6b92e5 60d4eb86 0b153874 a73ee510 392644bf 0ad37b4b 6532318c f9d99d81 8ceecbc8 4e06592a 2c9d222f e5ba7672 8f0f692f 21ddcdc9 b1252a9d cc6a9262 NaN 32c7478e a5862ce8 445bbe3b c8816bd2
99993 0 14.0 21.0 39.0 15.0 843.0 95.0 14.0 34.0 46.0 1.0 1.0 0.0 36.0 05db9164 e77e5e6e 12da0bba 620fad99 25c83c98 fe6b92e5 9fe79dba 0b153874 a73ee510 9bc1a7c1 e4034ebf af3d699c ea089f5d 1adce6ef 2c684cfd be928393 e5ba7672 449d6705 21ddcdc9 5840adea bacd3941 NaN 32c7478e 3fdb382b 001f3601 df5cbf86
99994 0 0.0 1.0 8.0 2.0 2482.0 52.0 5.0 10.0 192.0 0.0 2.0 NaN 2.0 05db9164 6887a43c 88f89572 f63c1df9 25c83c98 NaN f398f2de 0b153874 a73ee510 ad599406 9842b911 40aca95d 6d15ad33 07d13a8f db12b98c d6220aab e5ba7672 1e4afada 21ddcdc9 a458ea53 5a7f3ae0 NaN 32c7478e 9a1c7e3d 445bbe3b d1312b91
99995 0 1.0 60.0 37.0 0.0 1.0 0.0 4.0 0.0 23.0 1.0 3.0 NaN 0.0 68fd1e64 91381efc eacb18f6 f00be896 25c83c98 fe6b92e5 2489e185 985e3fcb a73ee510 3b08e48b 4d1f7d97 d6b4fe71 954029f8 1adce6ef a0602981 2a08cb76 e5ba7672 38b82d9f 21ddcdc9 b1252a9d 24e5131b NaN 32c7478e 4fa16304 47907db5 72966777
99996 1 NaN 0.0 12.0 NaN 173121.0 NaN 0.0 3.0 10.0 NaN 0.0 NaN NaN 68fd1e64 f0cf0024 6f67f7e5 41274cd7 4cf72387 fbad5c96 9e5d3694 0b153874 7cc72ec2 be289b53 699034a0 623049e6 a0edec24 1adce6ef 55dc357b c92f3b61 e5ba7672 b04e4670 21ddcdc9 b1252a9d 60f6221e NaN 32c7478e 43f13e8b ea9a246c 731c3655
99997 0 10.0 2.0 1.0 26.0 482.0 60.0 10.0 11.0 60.0 1.0 1.0 0.0 59.0 8cf07265 c8687797 b063fe4e 4b972461 25c83c98 7e0ccccf 176dc88e d7c4a8f5 a73ee510 3b08e48b d2b7c44b 8cdc4941 68637c0d b28479f6 dc96c4b0 3084c78b 07c540c4 a7e06874 21ddcdc9 a458ea53 514b7308 NaN 32c7478e 2fd70e1c 010f6491 ec26ad35
99998 0 NaN 390.0 43.0 4.0 345365.0 NaN 0.0 4.0 4.0 NaN 0.0 NaN 4.0 0e78bd46 a796837e dffca8ba 0fa0d423 25c83c98 fe6b92e5 bc324536 51d76abe 7cc72ec2 5cad6330 2bcfb78f 93bab460 e6fc496d cfef1c29 f0bf9094 6bb29970 e5ba7672 1cdbd1c5 NaN NaN d9d9202f NaN 32c7478e 8fc66e78 NaN NaN
99999 0 0.0 -1.0 137.0 19.0 9504.0 NaN 0.0 22.0 22.0 0.0 0.0 NaN 22.0 05db9164 38d50e09 dd17c91c 82a61820 25c83c98 7e0ccccf a088b320 5b392875 a73ee510 323dec10 6be5122d 75529ad8 c8c105ca 07d13a8f ee569ce2 5eea53aa e5ba7672 582152eb 21ddcdc9 5840adea 0f78ab39 c9d4222a 32c7478e cafb4e4d 001f3601 99f4f64c